Advancements in Few-Shot Learning and Vision-Language Models

The field of few-shot learning and vision-language models is moving toward more effective and efficient methods for adapting to new tasks and datasets. Researchers are exploring approaches that mitigate the challenges of limited labeled data, such as instance-level mismatches and class-level imprecision. Notably, prototype-guided curriculum learning frameworks and stochastic patch-filtering methods are showing promising results in improving few-shot performance. The adaptation of vision-language models to specific downstream applications without manual labeling is also becoming increasingly important. Noteworthy papers in this regard include: Prototype-Guided Curriculum Learning for Zero-Shot Learning, which proposes a framework to mitigate instance-level mismatches and class-level imprecision; Effortless Vision-Language Model Specialization in Histopathology without Annotation, which investigates annotation-free adaptation of vision-language models through continued pretraining on domain- and task-relevant image-caption pairs; and MOC: Meta-Optimized Classifier for Few-Shot Whole Slide Image Classification, which proposes a meta-optimized classifier that automatically selects a classifier configuration from a mixture of candidates.
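To make the prototype-guided idea concrete, the sketch below shows one common way such a curriculum can be built: compute a prototype (mean embedding) per class and order training instances easy-to-hard by their distance to their own class prototype, so likely mismatched or imprecise instances are deferred. This is an illustrative sketch only, not the papers' actual implementations; the function name and the use of Euclidean distance on precomputed embeddings are assumptions.

```python
import numpy as np

def prototype_curriculum(features, labels):
    """Return an easy-to-hard ordering of training instances.

    features: (n, d) array of precomputed embeddings.
    labels:   (n,) array of integer class labels.
    Instances close to their class prototype (mean embedding) are
    treated as "easy"; far-away instances, which are more likely to
    be mismatched or imprecisely labeled, come last.
    """
    # One prototype per class: the mean embedding of its instances.
    prototypes = {c: features[labels == c].mean(axis=0)
                  for c in np.unique(labels)}
    # Distance of each instance to its own class prototype.
    dists = np.array([np.linalg.norm(f - prototypes[c])
                      for f, c in zip(features, labels)])
    # Ascending distance = easy first, hard last.
    return np.argsort(dists)
```

A training loop would then feed batches in this order, optionally re-estimating prototypes as the embedding model improves.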

Sources

Prototype-Guided Curriculum Learning for Zero-Shot Learning

Effortless Vision-Language Model Specialization in Histopathology without Annotation

Calibration Attention: Instance-wise Temperature Scaling for Vision Transformers

Slot Attention-based Feature Filtering for Few-Shot Learning

MOC: Meta-Optimized Classifier for Few-Shot Whole Slide Image Classification

Stochastic-based Patch Filtering for Few-Shot Learning
