Recent work in infrared target detection and multimodal fusion centers on three themes: noise suppression, feature enhancement, and modality alignment. Researchers are exploring new perspectives, such as frequency-domain analysis and hypergraph-based temporal enhancement, to improve detection performance in complex environments. Noteworthy papers in this area include: NS-FPN, which proposes a noise-suppression feature pyramid network to reduce false alarms; MambaTrans, which introduces a multimodal fusion image modality translator that adapts fused images to downstream tasks; Skyshield, which presents an event-driven framework for detecting submillimetre-scale thin obstacles; COXNet, which develops a cross-layer fusion framework for RGBT tiny object detection; DOD-SA, which proposes a decoupled object detection framework that relies on single-modality annotations; and HyperTea, which integrates global and local temporal perspectives to model high-order spatiotemporal correlations for moving infrared small target detection. Together, these works point toward higher detection accuracy, fewer false alarms, and greater robustness in real-world applications.
Infrared Target Detection and Multimodal Fusion Advances
Sources
NS-FPN: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective
MambaTrans: Multimodal Fusion Image Translation via Large Language Model Priors for Downstream Visual Tasks