Domain Generalization and Adaptation in Computer Vision

Computer vision research is moving toward models that stay robust and generalize across domains and datasets. Recent work addresses domain shift and adaptation, with particular emphasis on semantic segmentation, object detection, and image translation. Many of the proposed methods aim to reduce reliance on large amounts of labeled data, drawing instead on techniques such as style transfer, attention refocusing, and causal representation learning.

Notable papers in this area include:

"Image Translation with Kernel Prediction Networks for Semantic Segmentation" proposes an image translation method that guarantees semantic matching between synthetic and real images.
"Text-Driven Causal Representation Learning for Source-Free Domain Generalization" integrates causal inference into the source-free domain generalization setting to obtain robust, domain-invariant features.
"Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks" investigates whether style transfer can reduce texture bias and improve the robustness of semantic segmentation networks.
"Prototypical Progressive Alignment and Reweighting for Generalizable Semantic Segmentation" proposes a framework that uses class-wise prototypes and progressive alignment, and reports state-of-the-art performance in generalizable semantic segmentation.
"Dual form Complementary Masking for Domain-Adaptive Image Segmentation" reframes masked reconstruction as a sparse signal reconstruction problem and proposes a simple unsupervised domain adaptation (UDA) framework that integrates masked reconstruction directly into the main training pipeline.
"SS-DC: Spatial-Spectral Decoupling and Coupling Across Visible-Infrared Gap for Domain Adaptive Object Detection" proposes a decoupling-coupling framework that separates domain-invariant from domain-specific features across multiple subdomains.
"Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization" proposes an attention-refocusing scheme that reduces domain shift by aligning attention maps in CLIP.
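
As a rough illustration of what "class-wise prototypes" can mean in this setting, the sketch below averages feature vectors per class and then labels a feature from a shifted domain by its most similar prototype. It is a generic toy example, not the method of any paper listed above: the function names (class_prototypes, cosine, nearest_prototype), the 2-D features, and the choice of cosine similarity are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def class_prototypes(features, labels):
    """Return {class_id: mean feature vector} over all features with that label."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in total] for c, total in sums.items()}


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_prototype(vec, prototypes):
    """Return the class whose prototype is most similar to vec."""
    return max(prototypes, key=lambda c: cosine(vec, prototypes[c]))


# Hypothetical toy "pixel features" from a labeled source domain (two classes).
source_features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
source_labels = [0, 0, 1, 1]
prototypes = class_prototypes(source_features, source_labels)

# A feature from a shifted domain is assigned to its closest prototype.
predicted = nearest_prototype([0.6, 0.3], prototypes)
print(prototypes, predicted)
```

In a real segmentation model the features would come from a network's per-pixel embeddings, and the prototypes would typically be used inside a training loss rather than for direct nearest-neighbor labeling; the sketch only shows the averaging and matching steps.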
