Knowledge Distillation Advancements

The field of knowledge distillation is moving toward methods that improve the transfer of knowledge from teacher models to student models. Recent work addresses limitations of existing distillation approaches, such as exposure bias and suboptimal generalization, and explores new ways of exploiting teacher knowledge, including relational inductive biases and uncertainty-aware distillation mechanisms. These advances can improve the performance and robustness of knowledge distillation across applications such as image classification and disease grading, as well as deployment on edge devices. Noteworthy papers are summarized below, followed by a sketch of the baseline logit-distillation objective these methods build on:

  • Swapped Logit Distillation via Bi-level Teacher Alignment, which proposes a logit-based distillation method that outperforms state-of-the-art approaches on image classification tasks.
  • Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks, which introduces a KD objective that aligns both the head and tail regions of the output distribution, improving generalization.
  • Uncertainty-Aware Multi-Expert Knowledge Distillation for Imbalanced Disease Grading, which achieves state-of-the-art results on disease image grading by decoupling task-agnostic and task-specific features and applying uncertainty-aware distillation.
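
For context, the listed methods refine the standard logit-matching objective: a temperature-softened KL divergence between teacher and student outputs combined with the usual cross-entropy loss on ground-truth labels. The sketch below is a generic PyTorch illustration of that baseline; the function name, temperature, and loss weighting are illustrative choices, and it does not implement any of the specific methods above.

```python
# Minimal sketch of standard logit-based knowledge distillation,
# the baseline the papers above refine; hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Softened KL term (teacher -> student) plus cross-entropy on hard labels."""
    # Soften both distributions with the temperature before comparing them.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL(teacher || student); the T^2 factor keeps gradient magnitudes
    # comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised term on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term


if __name__ == "__main__":
    # Toy usage with random logits for a 10-class problem.
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```

The works above depart from this baseline in different ways, for example by reshaping the KL term to weight head and tail regions of the distribution differently or by modulating the transfer with uncertainty estimates, rather than treating all teacher logits uniformly.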

Sources

Swapped Logit Distillation via Bi-level Teacher Alignment

Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks

Group Relative Knowledge Distillation: Learning from Teacher's Relational Inductive Bias

A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection

CAE-DFKD: Bridging the Transferability Gap in Data-Free Knowledge Distillation

Uncertainty-Aware Multi-Expert Knowledge Distillation for Imbalanced Disease Grading
