Advances in 3D Perception and Autonomous Driving

The field of 3D perception for autonomous driving is advancing rapidly, with a focus on improving the accuracy and efficiency of 3D object detection, tracking, and scene understanding. Recent work integrates multi-modal features, such as camera and LiDAR data, to improve the robustness and reliability of perception systems, and there is growing interest in leveraging foundation models and attention mechanisms to boost performance on 3D perception tasks. Noteworthy papers in this area include Bridging Perspectives: Foundation Model Guided BEV Maps for 3D Object Detection and Tracking, which proposes a hybrid detection and tracking framework that combines perspective-view and bird's-eye-view features; NV3D: Leveraging Spatial Shape Through Normal Vector-based 3D Object Detection, which uses local features drawn from voxel neighbors to capture the relationship between object surfaces and target entities; and XD-RCDepth: Lightweight Radar-Camera Depth Estimation with Explainability-Aligned and Distribution-Aware Distillation, which presents a lightweight architecture that reduces parameter count while maintaining comparable accuracy.
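As a rough illustration of the neighborhood-based surface cue that normal-vector-based detectors such as NV3D build on, the sketch below estimates per-point surface normals from k-nearest neighbors via PCA. This is a generic technique, not the paper's implementation; the function name, parameters, and use of NumPy/SciPy are assumptions for illustration.

import numpy as np
from scipy.spatial import cKDTree

def estimate_normals(points: np.ndarray, k: int = 16) -> np.ndarray:
    """Return one unit normal per point, taken as the smallest principal
    axis of its k-nearest-neighbor covariance (generic sketch, not NV3D)."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)            # k nearest neighbor indices per point
    normals = np.empty_like(points)
    for i, nbr in enumerate(idx):
        nbrs = points[nbr]
        cov = np.cov(nbrs - nbrs.mean(axis=0), rowvar=False)
        _, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
        normals[i] = eigvecs[:, 0]              # smallest-variance direction ~ surface normal
    # Orient normals toward the sensor origin (a common, assumed convention).
    flip = np.einsum('ij,ij->i', normals, -points) < 0
    normals[flip] *= -1
    return normals

if __name__ == "__main__":
    pts = np.random.rand(1000, 3).astype(np.float32)   # dummy point cloud
    print(estimate_normals(pts, k=10).shape)            # (1000, 3)

In a detection pipeline, such per-point or per-voxel normals would typically be concatenated with raw point features before voxelization, giving the backbone an explicit surface-orientation cue.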
Sources
DAGLFNet: Deep Attention-Guided Global-Local Feature Fusion for Pseudo-Image Point Cloud Segmentation
CurriFlow: Curriculum-Guided Depth Fusion with Optical Flow-Based Temporal Alignment for 3D Semantic Scene Completion
Novel Class Discovery for Point Cloud Segmentation via Joint Learning of Causal Representation and Reasoning