Distribution Alignment and Dataset Distillation

The field of dataset distillation is moving toward more effective methods for aligning distributions and preserving instance-level characteristics. Recent advances leverage optimal transport and geometry-aware measures to improve the generalization of models trained on distilled datasets. Several papers propose new approaches to distribution matching, trajectory-guided dataset distillation, and core distribution alignment, achieving state-of-the-art performance on a range of benchmarks. These developments have the potential to significantly reduce the storage and computational costs associated with large-scale datasets. Noteworthy papers include Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation, which reports a 4% accuracy improvement on ImageNet-1K, and CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation, which establishes a new state-of-the-art accuracy of 60.4% on ImageNet-1K.
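
To make the idea of distribution alignment concrete, the sketch below shows one generic way such methods are typically set up: learnable synthetic images are optimized so that their feature distribution matches that of real data under an entropic optimal transport (Sinkhorn) distance. This is a minimal illustration, not the method of any specific paper listed above; the feature extractor, Sinkhorn implementation, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch: distribution matching for dataset distillation via entropic OT.
# Assumptions: a toy feature extractor stands in for a pretrained backbone,
# and random tensors stand in for a real data batch.
import math
import torch
import torch.nn as nn


def sinkhorn_distance(x, y, eps=0.05, n_iters=50):
    """Entropic OT cost between two feature clouds x: (n, d) and y: (m, d)."""
    cost = torch.cdist(x, y, p=2) ** 2                      # pairwise squared-Euclidean cost
    n, m = cost.shape
    log_a = torch.full((n,), -math.log(n), device=x.device)  # uniform source weights (log)
    log_b = torch.full((m,), -math.log(m), device=x.device)  # uniform target weights (log)
    f = torch.zeros(n, device=x.device)
    g = torch.zeros(m, device=x.device)
    for _ in range(n_iters):                                 # log-domain Sinkhorn iterations
        f = -eps * torch.logsumexp((g[None, :] - cost) / eps + log_b[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - cost) / eps + log_a[:, None], dim=0)
    # Recover the transport plan and return the transport cost it induces.
    pi = torch.exp((f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :])
    return (pi * cost).sum()


# Toy feature extractor (assumption; real pipelines use a pretrained backbone).
feature_net = nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 64)
)

real_images = torch.randn(256, 3, 32, 32)                        # placeholder real batch
synthetic = torch.randn(32, 3, 32, 32, requires_grad=True)       # learnable distilled images
optimizer = torch.optim.Adam([synthetic], lr=0.1)

for step in range(100):
    optimizer.zero_grad()
    with torch.no_grad():
        real_feats = feature_net(real_images)                    # target feature distribution
    syn_feats = feature_net(synthetic)                           # distilled feature distribution
    loss = sinkhorn_distance(syn_feats, real_feats)              # align the two distributions
    loss.backward()
    optimizer.step()
```

In practice, published methods differ in what they align (per-class features, generator latents, training trajectories) and in the transport or divergence measure used, but the overall loop of optimizing a small synthetic set against a distributional objective is the common thread.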

Sources

Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation

Boltzmann-Shannon Index: A Geometric-Aware Measure of Clustering Balance

TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution

Optimal Transportation and Alignment Between Gaussian Measures

CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation

Technical Report on Text Dataset Distillation

Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective
