Efficient Neural Network Compression

The field of neural network compression is advancing rapidly, with a focus on techniques that reduce model size and complexity while preserving accuracy. Recent work integrates pruning with quantization, compounding their individual compression benefits. Notably, adaptive filtering and discrete Ricci curvature have shown promise for pruning neural networks efficiently, and multi-agent reinforcement learning and hybrid pruning frameworks have delivered substantial reductions in model complexity.
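
As a minimal sketch of how pruning and quantization compose, the snippet below applies magnitude pruning followed by symmetric 8-bit quantization to a weight matrix. This is a generic illustration, not the method of any paper listed here; all function names are hypothetical.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights (generic sketch)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def quantize_uniform(weights, bits=8):
    """Symmetric uniform quantization to signed integers (generic sketch)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

w_pruned, mask = prune_by_magnitude(w, sparsity=0.5)
q, scale = quantize_uniform(w_pruned, bits=8)
w_hat = q.astype(np.float32) * scale  # dequantized weights

print(f"sparsity: {1 - mask.mean():.2f}")
print(f"max reconstruction error: {np.max(np.abs(w_pruned - w_hat)):.4f}")
```

The pruned network stores only the surviving weights as int8 plus one scale per tensor, which is where the compounded size reduction comes from.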

Noteworthy papers include: "Attention as an Adaptive Filter," which introduces an attention mechanism that incorporates a learnable dynamics model; "Dynamic Sensitivity Filter Pruning using Multi-Agent Reinforcement Learning For DCNN's," which presents a single-shot filter pruning framework that evaluates the stability and redundancy of filter importance scores; and "Compressing CNN models for resource-constrained systems by channel and layer pruning," which demonstrates a hybrid approach combining channel and layer pruning to reduce model complexity.
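
To make the channel-pruning half of such a hybrid approach concrete, a common generic criterion ranks a convolution layer's output channels by L1 norm and keeps the strongest ones. The sketch below assumes this simple criterion and hypothetical names; it is not the actual algorithm of the paper above.

```python
import numpy as np

def select_channels(conv_weights, keep_ratio=0.5):
    """Rank output channels of a conv weight tensor (out, in, kH, kW)
    by L1 norm and return the sorted indices of channels to keep."""
    norms = np.abs(conv_weights).sum(axis=(1, 2, 3))
    n_keep = max(1, int(round(keep_ratio * len(norms))))
    return np.sort(np.argsort(norms)[-n_keep:])

rng = np.random.default_rng(1)
w = rng.normal(size=(8, 3, 3, 3))      # a toy 8-channel conv layer
keep = select_channels(w, keep_ratio=0.5)
w_pruned = w[keep]                     # half the output channels remain
print(w_pruned.shape)
```

Layer pruning then removes entire blocks whose contribution is small, so the two operate at different granularities of the same network.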

Sources

Attention as an Adaptive Filter

Integrating Pruning with Quantization for Efficient Deep Neural Networks Compression

Accuracy-Constrained CNN Pruning for Efficient and Reliable EEG-Based Seizure Detection

Application of discrete Ricci curvature in pruning randomly wired neural networks: A case study with chest x-ray classification of COVID-19

Dynamic Sensitivity Filter Pruning using Multi-Agent Reinforcement Learning For DCNN's

Compressing CNN models for resource-constrained systems by channel and layer pruning
