The field of autonomous driving is advancing rapidly, with a focus on improving vehicles' ability to understand and navigate complex driving scenarios. Researchers are exploring multimodal large language models, which show promise for improving the accuracy and efficiency of driving-scenario perception. A key challenge is optimizing these models for specific tasks such as cone detection, traffic-light recognition, and intersection alerts. To address it, researchers are developing new methods for dynamic prompt optimization, dataset construction, and model training. These advances have the potential to significantly improve the safety and efficiency of autonomous vehicles.

Noteworthy papers include:

- Research on Driving Scenario Technology Based on Multimodal Large Language Model Optimization, which proposes a comprehensive method for optimizing multimodal models in driving scenarios.
- Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models, which presents a hierarchical question-answering approach for scene understanding in autonomous vehicles, balancing cost-efficiency with detailed visual interpretation.
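The cost-efficiency idea behind hierarchical question-answering can be illustrated with a minimal sketch: a cheap coarse question triages each scene, and the more expensive detailed questions run only when the triage flags complexity. This is not the paper's actual implementation; `query_vlm` and its canned answers are hypothetical stand-ins for a real vision-language model call.

```python
def query_vlm(image: str, question: str) -> str:
    """Hypothetical VLM call; replace with a real model API."""
    # Stubbed answers for illustration only.
    canned = {
        "Is the scene complex (intersection, cones, or signals)?": "yes",
        "Describe the traffic lights and their states.": "one green light ahead",
        "Are there construction cones, and where?": "cones narrowing the right lane",
    }
    return canned.get(question, "unknown")

def hierarchical_scene_qa(image: str) -> dict:
    report = {"complex": False, "details": {}}
    # Level 1: a single coarse, low-cost triage question per frame.
    if query_vlm(image, "Is the scene complex (intersection, cones, or signals)?") == "yes":
        report["complex"] = True
        # Level 2: detailed questions are asked only for flagged scenes,
        # so most frames never incur the expensive fine-grained queries.
        for q in ("Describe the traffic lights and their states.",
                  "Are there construction cones, and where?"):
            report["details"][q] = query_vlm(image, q)
    return report

result = hierarchical_scene_qa("frame_0001.jpg")
```

The design choice is the gating itself: the coarse pass bounds per-frame cost, while detailed visual interpretation is reserved for the minority of scenes that warrant it.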