The field of large language models is moving toward more efficient quantization techniques that reduce computational cost and memory footprint. Recent work has concentrated on post-training quantization, including wavelet-enhanced high-fidelity 1-bit quantization and adaptive transforms for joint weight-activation quantization; both lines of work improve quantization fidelity and reduce the performance degradation that typically accompanies aggressive bit-width reduction. Researchers have also explored intrinsic structure as a proxy for saliency in mixed-precision quantization, low-rank prehab for preparing neural networks for SVD compression, and phase-aware quantization schemes for complex-valued models. Noteworthy papers include HBLLM, which introduces a wavelet-enhanced high-fidelity 1-bit post-training quantization method, and FAIRY2I, which presents a universal framework for transforming pre-trained real-valued layers into an equivalent widely-linear complex form for extremely low-bit quantization. Together, these advances could enable more efficient deployment of large language models on commodity hardware.
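To make the 1-bit direction concrete, the sketch below shows a minimal post-training binarization of a weight matrix with a per-row scale. This is only a baseline illustration of what 1-bit weight quantization means, not the wavelet-enhanced procedure used by HBLLM; the function names and the mean-absolute-value scaling rule are assumptions made for the example.

```python
import numpy as np

def binarize_weights(w: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Quantize a weight matrix to 1 bit per element with a per-row scale.

    Each row is approximated as alpha * sign(w); for a fixed sign pattern,
    alpha = mean(|w|) minimizes the L2 reconstruction error of that row.
    """
    signs = np.where(w >= 0, 1.0, -1.0)            # 1-bit codes
    alpha = np.abs(w).mean(axis=1, keepdims=True)  # per-row scale
    return signs, alpha

def dequantize(signs: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Reconstruct an approximate full-precision matrix from the 1-bit codes."""
    return alpha * signs

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
signs, alpha = binarize_weights(w)
w_hat = dequantize(signs, alpha)
rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(f"relative reconstruction error: {rel_err:.3f}")
```

Methods such as HBLLM aim to shrink exactly this kind of reconstruction error while keeping the 1-bit storage format.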
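The low-rank prehab line, in turn, prepares networks for the standard truncated-SVD factorization sketched below. The factorization itself is textbook linear algebra; the function name `svd_compress` and the rank choice are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def svd_compress(w: np.ndarray, rank: int) -> tuple[np.ndarray, np.ndarray]:
    """Factor a weight matrix into two low-rank factors via truncated SVD.

    A dense layer y = x @ w.T can then be replaced by two smaller matmuls,
    y = (x @ b.T) @ a.T, which saves parameters when rank * (m + n) < m * n.
    """
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (m, rank)
    b = vt[:rank, :]             # shape (rank, n)
    return a, b

rng = np.random.default_rng(0)
w = rng.normal(size=(512, 512)).astype(np.float32)
a, b = svd_compress(w, rank=64)
rel_err = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
print(f"rank-64 relative error: {rel_err:.3f}")
```

A random Gaussian matrix like the one above compresses poorly; the point of "prehab"-style training is to push real weight matrices toward lower effective rank so that this truncation loses less accuracy.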