Progress in Visual Localization and Mapping

Research in visual localization and mapping is advancing quickly, with new methods improving the accuracy and robustness of these systems and enabling deployment across applications ranging from autonomous vehicles to spacecraft navigation. Two trends stand out. First, large vision models and graph-based techniques are increasingly used for more efficient and effective feature extraction and matching. Second, new datasets and benchmarks are making it easier to evaluate and compare methods, accelerating progress in the field. Noteworthy papers include PanMatch, a versatile foundation model for robust correspondence matching; VISTA, a monocular segmentation-based mapping framework for appearance- and view-invariant global localization; and GeoDistill, a geometry-guided self-distillation framework for weakly supervised cross-view localization.

Sources

Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework

PanMatch: Unleashing the Potential of Large Vision Models for Unified Matching Models

View Invariant Learning for Vision-Language Navigation in Continuous Environments

360-Degree Full-view Image Segmentation by Spherical Convolution compatible with Large-scale Planar Pre-trained Models

Domain Adaptation and Multi-view Attention for Learnable Landmark Tracking with Sparse Data

Glance-MCMT: A General MCMT Framework with Glance Initialization and Progressive Association

FPC-Net: Revisiting SuperPoint with Descriptor-Free Keypoint Detection via Feature Pyramids and Consistency-Based Implicit Matching

A New Dataset and Performance Benchmark for Real-time Spacecraft Segmentation in Onboard Flight Computers

GeoDistill: Geometry-Guided Self-Distillation for Weakly Supervised Cross-View Localization

GKNet: Graph-based Keypoints Network for Monocular Pose Estimation of Non-cooperative Spacecraft

Comparison of Localization Algorithms between Reduced-Scale and Real-Sized Vehicles Using Visual and Inertial Sensors

VISTA: Monocular Segmentation-Based Mapping for Appearance and View-Invariant Global Localization

CorrMoE: Mixture of Experts with De-stylization Learning for Cross-Scene and Cross-Domain Correspondence Pruning

Tree-SLAM: semantic object SLAM for efficient mapping of individual trees in orchards

UniLGL: Learning Uniform Place Recognition for FOV-limited/Panoramic LiDAR Global Localization

FFI-VTR: Lightweight and Robust Visual Teach and Repeat Navigation based on Feature Flow Indicator and Probabilistic Motion Planning

DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model

$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation