Mamba-Based Architectures in Vision Tasks

Mamba-based architectures are increasingly being adopted for vision and related perception tasks, where they show promise in modeling long-range dependencies and capturing complex contextual information. The trend spans medical image detection, anomalous sound detection, light field super-resolution, and hyperspectral object tracking. Because Mamba processes sequences with complexity linear in sequence length, researchers are adapting it to task-specific requirements, with state-of-the-art results reported in several of these areas. Hybrid designs that combine Mamba with other models, such as Transformers, are also being used to exploit information across different domains.

Noteworthy papers in this area include:

SpectMamba, which reports state-of-the-art performance in medical image detection by mitigating frequency bias and capturing global context.

ESTM, which improves anomalous sound detection with a dual-path Mamba architecture built on selective state space models.

LFMT, which significantly outperforms current state-of-the-art methods in light field super-resolution by combining the strengths of Mamba and Transformer models.

HyMamba, which reports state-of-the-art performance in hyperspectral object tracking by unifying spectral, cross-depth, and temporal modeling through state space modules.

FSSM, which improves Mamba-based vision models by incorporating token correlations and modifying how the state space model is computed.
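The linear-complexity claim above is easier to see in a small example. The following is a minimal sketch of a diagonal state space recurrence, not the implementation used by Mamba or by any of the papers listed here. The function name `ssm_scan` and all parameter names are hypothetical, and the discretization (exponential for the state matrix, a simple Euler step for the input projection) is an assumption chosen for brevity.

```python
import math


def ssm_scan(x, a, b, c, delta):
    """Run a diagonal state space recurrence over a 1-D sequence.

    Hypothetical illustration only. Cost is O(len(x) * len(a)):
    linear in the sequence length for a fixed state size.

    x:     input sequence (list of floats)
    a:     diagonal of the continuous-time state matrix (negative floats)
    b, c:  input and output projections, same length as a
    delta: one positive step size per input element; letting it vary
           per element stands in for the "selective" behavior
    """
    n = len(a)
    h = [0.0] * n  # hidden state, carried from one step to the next
    ys = []
    for x_t, d_t in zip(x, delta):
        for i in range(n):
            a_bar = math.exp(d_t * a[i])  # discretized state transition
            b_bar = d_t * b[i]            # simplified (Euler) input scaling
            h[i] = a_bar * h[i] + b_bar * x_t
        # Output is a linear readout of the current hidden state.
        ys.append(sum(c[i] * h[i] for i in range(n)))
    return ys


# A unit impulse decays geometrically through a one-dimensional state.
ys = ssm_scan(x=[1.0, 0.0, 0.0], a=[-1.0], b=[1.0], c=[1.0],
              delta=[1.0, 1.0, 1.0])
```

Each input element is visited once and only a fixed-size state is carried forward, so the work grows linearly with sequence length, in contrast to the pairwise comparisons of full self-attention.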

Sources

SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection

ESTM: An Enhanced Dual-Branch Spectral-Temporal Mamba for Anomalous Sound Detection

Exploring Non-Local Spatial-Angular Correlations with a Hybrid Mamba-Transformer Framework for Light Field Super-Resolution

Hyperspectral Mamba for Hyperspectral Object Tracking

First-order State Space Model for Lightweight Image Super-resolution
