Progress in Test-Time Adaptation for Vision-Language Models

Research on vision-language models is increasingly focused on maintaining performance under distribution shift and in real-world deployment. Recent work on test-time adaptation spans continual-temporal test-time adaptation, risk monitoring during adaptation, and calibrated foundation models. These approaches aim to make vision-language models more reliable and robust in applications such as image classification and medical image tasks.

Notable papers: BayesTTA proposes a Bayesian adaptation framework for continual-temporal test-time adaptation and is reported to consistently outperform state-of-the-art methods in that setting. StaRFM introduces a unified framework for calibrated and robust foundation models and shows consistent performance gains on vision-language and medical image tasks, with improved calibration and robustness. GS-Bias presents an efficient test-time adaptation paradigm that incorporates global and spatial biases, achieving state-of-the-art performance on 15 benchmark datasets while requiring minimal computational resources.

Sources

BayesTTA: Continual-Temporal Test-Time Adaptation for Vision-Language Models via Gaussian Discriminant Analysis

Monitoring Risks in Test-Time Adaptation

Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift

Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations

Enhancing Cross Entropy with a Linearly Adaptive Loss Function for Optimized Classification Performance

GS-Bias: Global-Spatial Bias Learner for Single-Image Test-Time Adaptation of Vision-Language Models

LanePerf: a Performance Estimation Framework for Lane Detection
