Knowledge Distillation Advancements

The field of knowledge distillation is moving toward methods that improve the transfer of knowledge from teacher models to student models. Recent work addresses limitations of existing distillation approaches, such as exposure bias and suboptimal generalization, and explores new ways of exploiting teacher knowledge, including relational inductive biases and uncertainty-aware distillation mechanisms. These advances can improve the performance and robustness of knowledge distillation across applications such as image classification and disease grading, as well as deployment on edge devices. Noteworthy papers are summarized below, followed by a sketch of the baseline logit-distillation objective these methods build on:

  • Swapped Logit Distillation via Bi-level Teacher Alignment, which proposes a logit-based distillation method that outperforms state-of-the-art approaches on image classification tasks.
  • Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks, which introduces a KD objective that aligns both the head and tail regions of the output distribution, improving generalization.
  • Uncertainty-Aware Multi-Expert Knowledge Distillation for Imbalanced Disease Grading, which achieves state-of-the-art results on disease image grading by decoupling task-agnostic and task-specific features and applying uncertainty-aware distillation.
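
For context, the listed methods refine the standard logit-matching objective: a temperature-softened KL divergence between teacher and student outputs combined with the usual cross-entropy loss on ground-truth labels. The sketch below is a generic PyTorch illustration of that baseline; the function name, temperature, and loss weighting are illustrative choices, and it does not implement any of the specific methods above.

```python
# Minimal sketch of standard logit-based knowledge distillation,
# the baseline the papers above refine; hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Softened KL term (teacher -> student) plus cross-entropy on hard labels."""
    # Soften both distributions with the temperature before comparing them.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL(teacher || student); the T^2 factor keeps gradient magnitudes
    # comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised term on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term


if __name__ == "__main__":
    # Toy usage with random logits for a 10-class problem.
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```

The works above depart from this baseline in different ways, for example by reshaping the KL term to weight head and tail regions of the distribution differently or by modulating the transfer with uncertainty estimates, rather than treating all teacher logits uniformly.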

Sources

Swapped Logit Distillation via Bi-level Teacher Alignment

Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks

Group Relative Knowledge Distillation: Learning from Teacher's Relational Inductive Bias

A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection

CAE-DFKD: Bridging the Transferability Gap in Data-Free Knowledge Distillation

Uncertainty-Aware Multi-Expert Knowledge Distillation for Imbalanced Disease Grading
