Advancements in Autonomous Driving and Geo-Experience

The field of autonomous driving is shifting toward the integration of large language models (LLMs) and multi-modal perception fusion to improve scenario comprehension and decision-making. This convergence enables driving systems that interpret semantic information and infer the intentions of other road users in a more human-like way. Researchers are applying LLMs to planning, navigation, and dynamic adaptation, both in autonomous driving and in geo-experience applications such as travel planning and urban exploration, with reported gains in route completion, navigation accuracy, and resilience to disruptions.

Notable papers in this area include the following. LeAD presents a dual-rate autonomous driving architecture that integrates an imitation-learning-based end-to-end framework with LLM augmentation, achieving superior handling of unconventional scenarios (a minimal sketch of this dual-rate pattern follows below). The User-Centric Geo-Experience proposes an LLM-powered framework in which cooperative agents resolve complex multi-modal user queries, provide fine-grained guidance, and detect and respond to trip-plan disruptions. MoSE introduces a skill-oriented Mixture-of-Experts approach that mimics how human drivers learn and reason, skill by skill and step by step, achieving state-of-the-art performance with a significantly smaller activated model size.
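The dual-rate idea described for LeAD, a high-frequency end-to-end planner paired with a lower-frequency LLM reasoning loop, can be illustrated with a minimal sketch. The class names, the trigger condition for consulting the LLM, and the loop cadence below are assumptions for illustration only, not details taken from the paper.

```python
# Minimal sketch of a dual-rate planning loop in the spirit of the LeAD summary above.
# All names, rates, and the "unconventional scenario" trigger are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Observation:
    scene_description: str   # e.g. fused camera/LiDAR semantics
    is_unconventional: bool  # flag from an assumed novelty/anomaly detector

class EndToEndPlanner:
    """Fast loop: imitation-learning policy that produces a trajectory every tick."""
    def plan(self, obs: Observation) -> str:
        return "follow-lane trajectory"

class LLMAdvisor:
    """Slow loop: an LLM consulted periodically or when the scene looks unconventional."""
    def advise(self, obs: Observation) -> str:
        # A real system would prompt an LLM with the scene description here.
        return f"yield and re-plan around: {obs.scene_description}"

def drive(observations, slow_every_n: int = 10):
    planner, advisor = EndToEndPlanner(), LLMAdvisor()
    guidance = None
    for tick, obs in enumerate(observations):
        # Slow LLM loop: runs at a fraction of the control rate, or on demand.
        if obs.is_unconventional or tick % slow_every_n == 0:
            guidance = advisor.advise(obs)
        # Fast loop: the end-to-end planner runs every control tick.
        trajectory = planner.plan(obs)
        print(f"t={tick}: {trajectory}" + (f" | guidance: {guidance}" if guidance else ""))

if __name__ == "__main__":
    drive([Observation("clear road", False),
           Observation("overturned truck blocking lane", True)])
```

The point of the split is that the expensive LLM call never sits on the control-rate critical path; the end-to-end policy keeps producing trajectories while LLM guidance arrives asynchronously or on demand.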

Sources

LeAD: The LLM Enhanced Planning System Converged with End-to-end Autonomous Driving

The User-Centric Geo-Experience: An LLM-Powered Framework for Enhanced Planning, Navigation, and Dynamic Adaptation

MoSE: Skill-by-Skill Mixture-of-Expert Learning for Autonomous Driving
