The field of autonomous driving is moving toward more personalized, human-centric approaches. Researchers are integrating human preferences and driving styles into end-to-end autonomous driving systems, enabling more comfortable and trustworthy interaction between humans and autonomous vehicles. A key area of focus is the development of large-scale datasets annotated with diverse driving preferences, which facilitate the evaluation and improvement of personalized autonomous driving models. There is also growing interest in self-supervised learning methods that construct informative driving world models, enabling end-to-end planning without perception annotations. A further trend is the advancement of vision-language-action models, which are being improved through new action tokenization techniques and large-scale datasets, yielding more efficient and reliable robotic control. Noteworthy papers in this area include:
- StyleDrive, which introduces a large-scale real-world dataset for personalized end-to-end autonomous driving, along with a benchmark for evaluating personalized driving models.
- World4Drive, which presents an end-to-end autonomous driving framework that employs vision foundation models to build latent world models for generating and evaluating multi-modal planning trajectories.
- VQ-VLA, which proposes a vector-quantization-based action tokenizer that captures rich spatiotemporal dynamics and produces smoother action outputs; a minimal sketch of the quantization step appears after this list.
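
To make the tokenization idea concrete, here is a minimal sketch of vector-quantized action tokenization in Python. It is not VQ-VLA's actual implementation: the action dimensionality, chunk length, codebook size, and the frozen random codebook are all illustrative assumptions (a real tokenizer would learn the codebook jointly with an encoder and decoder, VQ-VAE style).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions, not taken from the paper: 7-DoF actions,
# chunks of 8 timesteps flattened into one vector per token, and a
# codebook of 256 embeddings.
ACTION_DIM = 7
CHUNK_LEN = 8
CODEBOOK_SIZE = 256
EMBED_DIM = ACTION_DIM * CHUNK_LEN

# Stand-in for a learned codebook; in VQ-VAE-style training this would
# be optimized jointly with an encoder/decoder.
codebook = rng.normal(size=(CODEBOOK_SIZE, EMBED_DIM))

def tokenize(actions: np.ndarray) -> np.ndarray:
    """Map a (T, ACTION_DIM) continuous action sequence to discrete token ids.

    Each CHUNK_LEN-step window is flattened and assigned the index of its
    nearest codebook entry (Euclidean distance). Trailing steps that do not
    fill a full chunk are dropped for simplicity.
    """
    T = actions.shape[0] // CHUNK_LEN * CHUNK_LEN
    chunks = actions[:T].reshape(-1, EMBED_DIM)           # (num_chunks, EMBED_DIM)
    # Squared distance from every chunk to every codebook vector.
    dists = ((chunks[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)                           # (num_chunks,) token ids

def detokenize(tokens: np.ndarray) -> np.ndarray:
    """Reconstruct a (T, ACTION_DIM) action sequence from token ids."""
    return codebook[tokens].reshape(-1, ACTION_DIM)

trajectory = rng.normal(size=(32, ACTION_DIM))            # 32-step action sequence
tokens = tokenize(trajectory)
reconstructed = detokenize(tokens)
print(tokens.shape, reconstructed.shape)                  # (4,) (32, 7)
```

Discretizing continuous action chunks into codebook indices like this is what lets a vision-language-action model treat control as next-token prediction; the fidelity of the reconstruction, and hence the smoothness of the executed actions, depends on how the codebook is learned.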