Advances in Video Object Segmentation and Tracking

Research on video object segmentation and tracking is advancing quickly, with most recent work aimed at improving accuracy and efficiency. One trend is the integration of large language models with visual understanding, which lets models segment and track objects identified by natural-language descriptions. Memory-augmented architectures, which carry object information across frames, and motion-guided cropping, which concentrates computation on the regions where motion occurs, have also shown promising results. Training-free frameworks and refinements of existing models account for much of the reported progress.

Noteworthy papers include Enhancing Sa2VA for Referent Video Object Segmentation, which reports a substantial improvement in Sa2VA's performance on the referring video object segmentation (RVOS) task, and Track-On2, which reports state-of-the-art results in online point tracking through architectural refinements and improved synthetic training strategies. MoCrop introduces a motion-aware adaptive cropping module for efficient video action recognition, and Sa2VA-i improves Sa2VA results by making training and inference consistent. These advances could benefit applications such as video editing, autonomous driving, and medical imaging.
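To make the idea of motion-guided cropping concrete, the following is a minimal sketch and not the method of MoCrop or any other paper listed here. It assumes simple frame differencing as the motion signal, and the names `motion_energy`, `motion_crop_box`, and `crop` are hypothetical helpers introduced only for illustration. Frames are plain nested lists of grayscale values to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
from typing import List, Tuple

# Assumption: a frame is a grayscale image stored as rows of pixel intensities.
Frame = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def motion_energy(frames: List[Frame]) -> Frame:
    """Accumulate per-pixel absolute differences between consecutive frames."""
    height, width = len(frames[0]), len(frames[0][0])
    energy = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                energy[y][x] += abs(curr[y][x] - prev[y][x])
    return energy


def motion_crop_box(frames: List[Frame], threshold: float = 0.0, margin: int = 1) -> Box:
    """Bounding box around pixels whose motion energy exceeds `threshold`.

    Falls back to the full frame when no pixel moves.
    """
    energy = motion_energy(frames)
    height, width = len(energy), len(energy[0])
    moving = [
        (y, x)
        for y in range(height)
        for x in range(width)
        if energy[y][x] > threshold
    ]
    if not moving:
        return 0, 0, height, width
    ys = [y for y, _ in moving]
    xs = [x for _, x in moving]
    # Pad the box by `margin` pixels and clamp it to the frame boundaries.
    top = max(min(ys) - margin, 0)
    left = max(min(xs) - margin, 0)
    bottom = min(max(ys) + 1 + margin, height)
    right = min(max(xs) + 1 + margin, width)
    return top, left, bottom, right


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the box out of a single frame."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


def make_frame(height: int, width: int, bright: Tuple[int, int]) -> Frame:
    """Toy frame: all zeros except one bright pixel at (row, col)."""
    frame = [[0.0] * width for _ in range(height)]
    frame[bright[0]][bright[1]] = 1.0
    return frame


# Toy clip: a single bright pixel moves right, then down, in a 6x8 frame.
clip = [make_frame(6, 8, p) for p in [(2, 3), (2, 4), (3, 4)]]
box = motion_crop_box(clip)
cropped_clip = [crop(f, box) for f in clip]
print(box, len(cropped_clip[0]), len(cropped_clip[0][0]))
```

Computing one box per clip, rather than one per frame, keeps the crop fixed across frames, so a downstream model would see a stable field of view. That is a design choice of this sketch, not something taken from the papers above.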

Sources

Enhancing Sa2VA for Referent Video Object Segmentation: 2nd Solution for 7th LSVOS RVOS Track

Enriched Feature Representation and Motion Prediction Module for MOSEv2 Track of 7th LSVOS Challenge: 3rd Place Solution

MVP: Motion Vector Propagation for Zero-Shot Video Object Detection

MoCrop: Training Free Motion Guided Cropping for Efficient Video Action Recognition

Enhancing Video Object Segmentation in TrackRAD Using XMem Memory Network

3rd Place Report of LSVOS 2025 MeViS Track: Sa2VA-i: Improving Sa2VA Results with Consistent Training and Inference

Track-On2: Enhancing Online Point Tracking with Memory

The 1st Solution for MOSEv2 Challenge 2025: Long-term and Concept-aware Video Segmentation via SeC
