Neural network compression and optimization is advancing rapidly, with recent work aimed at reducing memory and compute costs while preserving model accuracy. Recent developments center on low-rank factorization, sparse dictionary learning, and dynamic rank allocation, which have shown promising results for compressing both large language models and convolutional neural networks. Noteworthy papers include LANCE, which proposes a framework for efficient on-device continual learning, and CoSpaDi, which introduces a compression framework based on sparse dictionary learning. In addition, BALF and D-Rank demonstrate the effectiveness of budgeted and dynamic rank allocation, respectively, for compressing models without fine-tuning. Together, these advances move toward efficient deployment of neural networks on edge devices and better performance in resource-constrained environments.
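To make concrete what the low-rank factorization methods above build on, here is a minimal sketch of compressing a single dense weight matrix with a truncated SVD. The layer shape and rank are illustrative assumptions, and the snippet is not the method of any specific paper mentioned here; approaches such as BALF and D-Rank go further by allocating a per-layer rank under a global budget rather than fixing one rank everywhere.

```python
# Minimal sketch: rank-r truncated SVD of one dense weight matrix.
# The 1024x4096 shape and r=128 are illustrative, not taken from any cited paper.
import numpy as np

def low_rank_factorize(W: np.ndarray, r: int):
    """Factor W (out_dim x in_dim) into A @ B with A: out_dim x r, B: r x in_dim."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]   # absorb singular values into the left factor
    B = Vt[:r, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 4096)).astype(np.float32)
A, B = low_rank_factorize(W, r=128)

orig_params = W.size                 # 1024 * 4096 = 4,194,304
compressed_params = A.size + B.size  # 1024*128 + 128*4096 = 655,360
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {orig_params} -> {compressed_params}, relative error: {rel_error:.3f}")
```

Replacing the original matrix multiply `x @ W.T` with `x @ B.T @ A.T` cuts both parameters and multiply-accumulate operations roughly in proportion to the chosen rank, which is the basic trade-off the rank-allocation methods above tune per layer.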