Advancements in Autonomous Driving

The field of autonomous driving is advancing rapidly, with a focus on improving safety, trustworthiness, and generalization. Recent work explores vision-language models (VLMs) to enhance driving decision-making, with applications in risk perception, driver attention modeling, and scene understanding. Notable frameworks include GraphPilot, which conditions language-based driving models on structured relational context in the form of scene graphs, and VLA-R, an open-world end-to-end autonomous driving framework that couples open-world perception with a novel vision-action retrieval paradigm. These innovations yield significant gains in driving performance: GraphPilot reports up to a 15.6% increase in driving score, while VLA-R demonstrates strong generalization and exploratory performance in unstructured environments. In parallel, benchmarks such as DSBench underscore the importance of evaluating VLMs' awareness of both external and in-cabin safety risks in a unified manner. Overall, the field is moving toward more robust, interpretable, and generalizable autonomous driving systems.
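
To make the idea of structured relational conditioning concrete, below is a minimal sketch of how a scene graph might be serialized into textual context and prepended to a driving instruction for a VLM. All names (`Relation`, `serialize_scene_graph`, `build_prompt`) are hypothetical illustrations, not GraphPilot's actual interface.

```python
# Minimal sketch of scene-graph-conditioned prompting for a driving VLM.
# All identifiers are illustrative assumptions; the GraphPilot paper may
# condition the model quite differently.
from dataclasses import dataclass

@dataclass
class Relation:
    subject: str    # e.g. "ego vehicle"
    predicate: str  # e.g. "following"
    obj: str        # e.g. "white truck"

def serialize_scene_graph(relations: list[Relation]) -> str:
    """Flatten (subject, predicate, object) triples into a textual context block."""
    return "\n".join(f"- {r.subject} {r.predicate} {r.obj}" for r in relations)

def build_prompt(relations: list[Relation], instruction: str) -> str:
    """Prepend structured relational context to the driving instruction."""
    return (
        "Scene graph:\n"
        f"{serialize_scene_graph(relations)}\n\n"
        f"Task: {instruction}"
    )

if __name__ == "__main__":
    graph = [
        Relation("ego vehicle", "following", "white truck"),
        Relation("pedestrian", "crossing ahead of", "ego vehicle"),
        Relation("traffic light", "is red for", "ego lane"),
    ]
    print(build_prompt(graph, "Propose the next driving action and justify it."))
```

The point of the sketch is only that relational structure, which raw images leave implicit, can be handed to a language-based driving model as explicit context.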

Sources

Semantic VLM Dataset for Safe Autonomous Driving

GraphPilot: Grounded Scene Graph Conditioning for Language-Based Autonomous Driving

CADD: A Chinese Traffic Accident Dataset for Statute-Based Liability Attribution

LAVQA: A Latency-Aware Visual Question Answering Framework for Shared Autonomy in Self-Driving Vehicles

Real-Time Drivers' Drowsiness Detection and Analysis through Deep Learning

VLA-R: Vision-Language Action Retrieval toward Open-World End-to-End Autonomous Driving

Prompt-Driven Domain Adaptation for End-to-End Autonomous Driving via In-Context RL

FSDAM: Few-Shot Driving Attention Modeling via Vision-Language Coupling

Building Egocentric Procedural AI Assistant: Methods, Benchmarks, and Challenges

Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)

A Real-Time Driver Drowsiness Detection System Using MediaPipe and Eye Aspect Ratio

VLMs Guided Interpretable Decision Making for Autonomous Driving

Visionary Co-Driver: Enhancing Driver Perception of Potential Risks with LLM and HUD

Context-aware, Ante-hoc Explanations of Driving Behaviour

Abstract Scene Graphs: Formalizing and Monitoring Spatial Properties of Automated Driving Functions

Enhancing End-to-End Autonomous Driving with Risk Semantic Distillation from VLM

Is Your VLM for Autonomous Driving Safety-Ready? A Comprehensive Benchmark for Evaluating External and In-Cabin Risks