Efficient Model Parameterization and Pruning in Deep Learning

The field of deep learning is moving toward more efficient model parameterization and pruning. Researchers are reducing parameter counts while preserving performance, with particular attention to Vision Transformers and speech recognition models. Notable advances include frameworks for estimating a model's effective rank, which enable substantial parameter compression with little loss in accuracy, and evidence that parameter-reduction techniques such as weight sharing and width reduction can improve both performance and training stability; both ideas are sketched below.

Noteworthy papers:

Estimating the Effective Rank of Vision Transformers via Low-Rank Factorization introduces a framework for estimating a model's intrinsic dimensionality.

Parameter Reduction Improves Vision Transformers: A Comparative Study of Sharing and Width Reduction demonstrates that reducing parameters in Vision Transformers can improve performance and training stability.

GRASP: GRouped Activation Shared Parameterization for Parameter-Efficient Fine-Tuning and Robust Inference of Transformers proposes a lightweight fine-tuning framework that reduces the number of trainable parameters while preserving model performance.
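The effective-rank idea can be made concrete with a truncated SVD: keep just enough singular directions to capture most of a weight matrix's spectral energy, then store two thin factors instead of the full matrix. The sketch below is a minimal illustration, not the paper's exact procedure; the 0.99 energy threshold and the helper names `effective_rank` and `low_rank_factors` are assumptions made for this example.

```python
import numpy as np

def effective_rank(weight: np.ndarray, energy: float = 0.99) -> int:
    """Smallest rank r whose top-r singular values capture `energy`
    of the matrix's total squared spectral energy (illustrative criterion)."""
    s = np.linalg.svd(weight, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cumulative, energy) + 1)

def low_rank_factors(weight: np.ndarray, r: int):
    """Factor W (d_out x d_in) into A (d_out x r) @ B (r x d_in),
    replacing d_out * d_in parameters with r * (d_out + d_in)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :r] * s[:r]  # absorb singular values into the left factor
    b = vt[:r, :]
    return a, b

# Synthetic ViT-style projection (3072 x 768) built to be close to rank 64.
rng = np.random.default_rng(0)
w = rng.standard_normal((3072, 64)) @ rng.standard_normal((64, 768))
r = effective_rank(w)
a, b = low_rank_factors(w, r)
print(f"effective rank ~ {r}, "
      f"compression ~ {w.size / (a.size + b.size):.1f}x")
```

When the estimated rank sits far below min(d_out, d_in), the factored layer delivers the parameter compression the summary describes, at the cost of one extra matrix multiply per layer.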
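Parameter sharing, one of the two reduction strategies in the comparative study above, can be sketched as reusing a single transformer block at every depth step, so the parameter count becomes independent of depth. Full cross-layer sharing (ALBERT-style) is an assumption here, as are the dimensions; the paper itself may share only subsets of weights.

```python
import torch
import torch.nn as nn

class SharedDepthEncoder(nn.Module):
    """Applies one set of block weights `depth` times, so the parameter
    count is independent of depth (hypothetical full-sharing scheme)."""
    def __init__(self, dim: int = 384, heads: int = 6, depth: int = 12):
        super().__init__()
        self.depth = depth
        self.block = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim,
            batch_first=True, norm_first=True,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.depth):  # same weights at every depth step
            x = self.block(x)
        return x

tokens = torch.randn(2, 197, 384)  # (batch, ViT CLS + patch tokens, dim)
model = SharedDepthEncoder()
print(model(tokens).shape)  # torch.Size([2, 197, 384])
params = sum(p.numel() for p in model.parameters())
print(f"{params / 1e6:.2f}M parameters for 12 effective layers")
```

Width reduction, the alternative the study compares against, would instead shrink `dim` (and with it every weight matrix) while keeping separate weights per layer.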

Sources

Estimating the Effective Rank of Vision Transformers via Low-Rank Factorization

Parameter Reduction Improves Vision Transformers: A Comparative Study of Sharing and Width Reduction

Projection-Free CNN Pruning via Frank-Wolfe with Momentum: Sparser Models with Less Pretraining

ZO-ASR: Zeroth-Order Fine-Tuning of Speech Foundation Models without Back-Propagation

Dialect Identification Using Resource-Efficient Fine-Tuning Approaches

Diminishing Returns in Self-Supervised Learning

GRASP: GRouped Activation Shared Parameterization for Parameter-Efficient Fine-Tuning and Robust Inference of Transformers
