The field of autonomous driving is seeing rapid progress in 3D perception, driven by the need for accurate and reliable environment understanding. A key direction is sensor fusion, which combines data from modalities such as LiDAR, cameras, and radar to improve 3D occupancy prediction and object detection. Researchers are also exploring foundation models and semantic segmentation to improve the accuracy and robustness of 3D perception systems. Noteworthy papers include HeCoFuse, which proposes a unified framework for cooperative perception across mixed sensor setups; SDGOCC, which introduces a multimodal occupancy prediction network with joint semantic- and depth-guided view transformation; and GaussianFusionOcc, which presents a seamless sensor fusion approach based on 3D Gaussians, demonstrating improved memory efficiency and inference speed. Together, these advances stand to improve the safety and functionality of autonomous vehicles.
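To make the multimodal occupancy-prediction setting concrete, the sketch below shows a minimal fusion head in PyTorch that concatenates camera and LiDAR features already lifted into a shared voxel grid and predicts per-voxel semantic occupancy. The module name, channel sizes, and concatenation-based fusion are illustrative assumptions for this example only; they do not reproduce the architectures of HeCoFuse, SDGOCC, or GaussianFusionOcc.

```python
import torch
import torch.nn as nn


class SimpleFusionOcc(nn.Module):
    """Toy multimodal occupancy head: fuses camera and LiDAR voxel
    features and predicts per-voxel semantic occupancy logits.
    Layer sizes and concatenation-based fusion are illustrative
    assumptions, not the design of any cited paper."""

    def __init__(self, cam_channels=64, lidar_channels=64, num_classes=17):
        super().__init__()
        fused = cam_channels + lidar_channels
        self.fuse = nn.Sequential(
            nn.Conv3d(fused, 128, kernel_size=3, padding=1),
            nn.BatchNorm3d(128),
            nn.ReLU(inplace=True),
        )
        self.head = nn.Conv3d(128, num_classes, kernel_size=1)

    def forward(self, cam_voxels, lidar_voxels):
        # Both inputs: (B, C, X, Y, Z) voxel grids already lifted to a
        # shared 3D frame (the image-to-3D view transformation is omitted).
        x = torch.cat([cam_voxels, lidar_voxels], dim=1)
        x = self.fuse(x)
        return self.head(x)  # (B, num_classes, X, Y, Z) occupancy logits


if __name__ == "__main__":
    model = SimpleFusionOcc()
    cam = torch.randn(1, 64, 50, 50, 8)    # camera features in voxel space
    lidar = torch.randn(1, 64, 50, 50, 8)  # LiDAR features in voxel space
    logits = model(cam, lidar)
    print(logits.shape)  # torch.Size([1, 17, 50, 50, 8])
```

In practice, the interesting differences between the cited methods lie in what this sketch omits: how image features are lifted into 3D (e.g., depth- or semantics-guided view transformation) and how the fused scene is represented (dense voxels versus sparse primitives such as 3D Gaussians).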