Advances in Medical Vision-Language Models and Few-Shot Learning

Recent work in medical computer vision centers on two areas: robust few-shot learning and medical vision-language models. Researchers are exploring cross-domain pre-training strategies and feature decoupling frameworks to improve model performance and reliability under distribution shift. Noteworthy papers include:

A study on pre-training across domains for few-shot surgical skill assessment, which achieved accuracies of up to 73.65% in the 5-shot setting.
The introduction of DRiFt, a structured feature decoupling framework that improves in-distribution performance and robustness across unseen datasets.
The development of Density-Aware Farthest Point Sampling, a novel sampling method that reduces the mean absolute prediction error in regression models.
A method for efficient conformal prediction for regression models under label noise, which achieves performance close to the clean-label setting.
The introduction of CalibPrompt, a framework for calibrating medical vision-language models during prompt tuning, which improves calibration without drastically affecting clean accuracy.
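
For readers unfamiliar with what "calibration" measures in the last item, one widely used metric is the expected calibration error (ECE): predictions are grouped into confidence bins, and the gap between average confidence and observed accuracy is averaged across bins, weighted by bin size. The sketch below is a generic ECE computation and not CalibPrompt's method or evaluation code; the function name, the 10-bin default, and the half-open binning rule are assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: size-weighted mean of |accuracy - mean confidence| per bin.

    Hypothetical helper for illustration; not taken from any cited paper.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("need at least one prediction")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    n = len(confidences)
    # Per-bin accumulators: sample count, summed confidence, summed correctness.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0.0] * n_bins

    for c, ok in zip(confidences, correct):
        # Equal-width bins [k/n_bins, (k+1)/n_bins); a confidence of
        # exactly 1.0 is placed in the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += c
        hit_sums[b] += 1.0 if ok else 0.0

    ece = 0.0
    for cnt, conf_sum, hit_sum in zip(counts, conf_sums, hit_sums):
        if cnt == 0:
            continue  # empty bins contribute nothing
        gap = abs(hit_sum / cnt - conf_sum / cnt)
        ece += (cnt / n) * gap
    return ece
```

As a usage example, four predictions made at confidence 0.9 of which three are correct give an ECE of roughly 0.15, since the model claims 90% confidence but is right 75% of the time. A lower value indicates that reported confidences track actual accuracy more closely.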
