Computer vision research is advancing rapidly, with a focus on robust and accurate models for autonomous systems and urban environments. Recent work emphasizes comprehensive datasets that capture the complexity of real-world scenarios, such as dense and dynamic urban scenes, and the integration of multimodal data sources, including cameras, LiDAR, and radar. Notable papers in this area include: the ODOR dataset, which provides a large-scale collection of object annotations for artworks and challenges researchers to explore the intersection of object recognition and smell perception; the RoundaboutHD dataset, which offers a comprehensive benchmark for multi-camera vehicle tracking in real-world urban environments; the EGC-VMAP framework, which generates accurate city-scale vectorized maps from crowdsourced vehicle data; and the TruckV2X dataset, which addresses the unique perception challenges of autonomous trucking and provides a foundation for developing cooperative perception systems.