Advances in 3D Perception and Autonomous Navigation

Research in 3D perception and autonomous navigation is advancing rapidly, with a focus on more accurate and efficient methods for understanding and interacting with complex environments. Recent work highlights the importance of bridging the modality gap between 2D and 3D tasks and introduces frameworks and architectures for doing so. Notable developments include the use of abstract bounding boxes to encode geometric structure and physical kinematics, and the integration of 2D semantic cues with 3D geometric reasoning.

There has also been significant progress on autonomous underwater cognitive systems, which enable adaptive navigation in complex oceanic conditions. Other work focuses on improving the accuracy and reliability of inertial navigation systems and on more effective methods for place recognition and localization. Together, these advances could enable more sophisticated autonomous systems, with applications in robotics, embodied intelligence, and environmental monitoring.

Noteworthy papers include:

Abstract 3D Perception for Spatial Intelligence in Vision-Language Models, which introduces a simple yet effective framework for improving spatial intelligence in vision-language models.

Task-Aware 3D Affordance Segmentation via 2D Guidance and Geometric Refinement, which presents a geometry-optimized framework for understanding 3D scene-level affordances.

Autonomous Underwater Cognitive System for Adaptive Navigation, which integrates SLAM with a cognitive architecture to enable adaptive navigation in complex underwater environments.

Sources

Abstract 3D Perception for Spatial Intelligence in Vision-Language Models

Task-Aware 3D Affordance Segmentation via 2D Guidance and Geometric Refinement

Autonomous Underwater Cognitive System for Adaptive Navigation: A SLAM-Integrated Cognitive Architecture

ResAlignNet: A Data-Driven Approach for INS/DVL Alignment

Orientation-Free Neural Network-Based Bias Estimation for Low-Cost Stationary Accelerometers

GaRLILEO: Gravity-aligned Radar-Leg-Inertial Enhanced Odometry

Going Places: Place Recognition in Artificial and Natural Systems

SweeperBot: Making 3D Browsing Accessible through View Analysis and Visual Question Answering

Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains

MambaIO: Global-Coordinate Inertial Odometry for Pedestrians via Multi-Scale Frequency-Decoupled Modeling

LLaVA$^3$: Representing 3D Scenes like a Cubist Painter to Boost 3D Scene Understanding of VLMs
