Structured Pruning of Deep Neural Networks

Research on deep neural networks is moving toward more efficient and effective model compression, with a focus on structured pruning methods that reduce computational cost while preserving model performance. Recent work explores novel regularizers and importance metrics that challenge the traditional bias toward magnitude-based pruning decisions, aiming to give every filter a fair chance of being pruned and to make pruning behavior more robust. These methods are backed by emerging theoretical frameworks and validated empirically, yielding strong pruning results across a range of datasets and models. Noteworthy papers include Catalyst, which proposes a novel regularizer for structured pruning based on an auxiliary extension of the parameter space, and IPPRO, which introduces a projective offset for magnitude-indifferent structural pruning; both demonstrate near-lossless pruning and promising performance after fine-tuning.
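For context, the sketch below illustrates the conventional magnitude-based structured (filter-level) pruning baseline that this line of work argues against, not the methods of Catalyst or IPPRO themselves. It is a minimal PyTorch example; the layer, keep ratio, and helper names are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

def filter_importance(conv: nn.Conv2d) -> torch.Tensor:
    """L2 norm of each output filter's weights: a common magnitude-based importance score."""
    # weight shape: (out_channels, in_channels, kH, kW) -> one score per output filter
    return conv.weight.detach().flatten(1).norm(p=2, dim=1)

def prune_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d that keeps only the highest-importance filters (structured pruning)."""
    scores = filter_importance(conv)
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep_idx = scores.topk(n_keep).indices.sort().values

    pruned = nn.Conv2d(
        conv.in_channels, n_keep, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, bias=conv.bias is not None,
    )
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep_idx])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep_idx])
    return pruned

# Hypothetical usage: drop half of the filters in a single convolution layer.
conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
pruned = prune_filters(conv, keep_ratio=0.5)
print(conv.weight.shape, "->", pruned.weight.shape)  # (32, 16, 3, 3) -> (16, 16, 3, 3)
```

Because whole filters are removed, the pruned layer stays dense and needs no sparse kernels; the magnitude bias of the importance score is exactly what regularizer- and offset-based approaches such as Catalyst and IPPRO aim to correct.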

Sources

Catalyst: a Novel Regularizer for Structured Pruning with Auxiliary Extension of Parameter Space

IPPRO: Importance-based Pruning with PRojective Offset for Magnitude-indifferent Structural Pruning

Application-Specific Component-Aware Structured Pruning of Deep Neural Networks via Soft Coefficient Optimization

Efficient Column-Wise N:M Pruning on RISC-V CPU
