The field of neural network quantization is advancing rapidly, with a focus on efficient and accurate methods for reducing the computational requirements of deep neural networks. Researchers are exploring approaches such as probabilistic frameworks for dynamic quantization and double binary factorization to improve performance while lowering memory overhead. There is also growing interest in decision tree learning, with an emphasis on constructing small decision trees with few outliers and on efficient algorithms for integer-only decision tree inference. Noteworthy papers in this area include:
- A probabilistic framework for dynamic quantization that reduces computational overhead with negligible loss in accuracy (see the first sketch after this list).
- Double Binary Factorization, which preserves the efficiency advantages of binary representations while achieving competitive compression rates (see the second sketch after this list).
- InTreeger, an end-to-end framework for integer-only decision tree inference that generates highly optimized integer-only classification models (see the third sketch after this list).
- A simple approximation algorithm for the optimal decision tree problem, offering a straightforward solution to a fundamental problem in machine learning (see the final sketch after this list).
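
The probabilistic framework itself is not detailed here; as context for what dynamic quantization means, the following is a minimal sketch of standard per-tensor symmetric int8 dynamic quantization, where the scale is derived from each tensor's runtime range rather than fixed at calibration time. All function names are illustrative.

```python
import numpy as np

def dynamic_quantize_int8(x):
    """Symmetric per-tensor int8 quantization with a runtime scale.

    'Dynamic' means the scale is recomputed from each tensor's observed
    range at inference time rather than fixed during calibration.
    """
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Map int8 codes back to approximate float values.
    return q.astype(np.float32) * scale

x = np.random.default_rng(0).standard_normal(1024).astype(np.float32)
q, s = dynamic_quantize_int8(x)
print("max reconstruction error:", np.abs(x - dequantize(q, s)).max())
```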
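Double Binary Factorization approximates a weight matrix as the product of two binary (sign) matrices, each with a scale, so that multiplications reduce to sign flips and additions. The paper's exact fitting procedure is not reproduced here; the sketch below uses a simple alternating scheme (binarize one factor, refit the other by least squares) purely to illustrate the idea, and the function names and update rule are assumptions.

```python
import numpy as np

def binarize_with_scale(M):
    # Best scaled-sign approximation of M: s * sign(M), with the
    # Frobenius-optimal scalar s = mean(|M|).
    B = np.sign(M)
    B[B == 0] = 1.0
    return np.abs(M).mean(), B

def double_binary_factorization(W, rank, iters=20, seed=0):
    # Alternating refinement (illustrative, not the paper's method):
    # W (m x n) ~= (s1 * B1) @ (s2 * B2), with B1 in {-1,+1}^(m x r)
    # and B2 in {-1,+1}^(r x n).
    rng = np.random.default_rng(seed)
    m, n = W.shape
    L = rng.standard_normal((m, rank))
    R = rng.standard_normal((rank, n))
    for _ in range(iters):
        s1, B1 = binarize_with_scale(L)
        R = np.linalg.lstsq(s1 * B1, W, rcond=None)[0]          # refit right factor
        s2, B2 = binarize_with_scale(R)
        L = np.linalg.lstsq((s2 * B2).T, W.T, rcond=None)[0].T  # refit left factor
    s1, B1 = binarize_with_scale(L)
    s2, B2 = binarize_with_scale(R)
    return s1, B1, s2, B2

W = np.random.default_rng(1).standard_normal((64, 64))
s1, B1, s2, B2 = double_binary_factorization(W, rank=64)
err = np.linalg.norm(W - (s1 * B1) @ (s2 * B2)) / np.linalg.norm(W)
print(f"relative error: {err:.3f}")
```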
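InTreeger's actual code-generation pipeline is not shown here; the sketch below illustrates the underlying idea of integer-only tree inference: quantize the trained float thresholds to fixed-point integers once, offline, so that traversal needs only integer comparisons. The scale, flat-array layout, and the tiny hand-built tree are all illustrative assumptions.

```python
import numpy as np

SCALE = 1 << 8  # fixed-point scale (assumption; the framework's scheme may differ)

# A tiny hand-built tree in flat-array form:
# node 0 splits on feature 0; nodes 1 and 2 are leaves (feature = -1).
feature   = np.array([0, -1, -1])
threshold = np.array([0.5, 0.0, 0.0])   # float thresholds from training
left      = np.array([1, -1, -1])
right     = np.array([2, -1, -1])
leaf_cls  = np.array([-1, 0, 1])

# Offline step: quantize thresholds to integers once.
threshold_q = np.round(threshold * SCALE).astype(np.int32)

def predict_int(x_q):
    # Integer-only traversal: x_q holds inputs pre-scaled by SCALE,
    # so every comparison is between int32 values.
    i = 0
    while feature[i] >= 0:
        i = left[i] if x_q[feature[i]] <= threshold_q[i] else right[i]
    return leaf_cls[i]

x = np.array([0.3])                          # raw float input
x_q = np.round(x * SCALE).astype(np.int32)   # scaled once at the boundary
print(predict_int(x_q))                      # -> 0
```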
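The paper's specific approximation algorithm is not reproduced here; as a reference point, the sketch below shows the classic greedy heuristic for the decision tree identification problem, which repeatedly picks the binary test that splits the remaining candidates most evenly. The problem instance and all names are illustrative.

```python
def greedy_decision_tree(tests, items):
    # Greedy heuristic: choose the test whose yes/no split of the
    # remaining candidates is most balanced, then recurse on each side.
    if len(items) <= 1:
        return items
    best = max(tests, key=lambda t: min(sum(t(i) for i in items),
                                        sum(not t(i) for i in items)))
    yes = [i for i in items if best(i)]
    no = [i for i in items if not best(i)]
    if not yes or not no:   # no test distinguishes the remaining items
        return items
    return (best,
            greedy_decision_tree(tests, yes),
            greedy_decision_tree(tests, no))

def identify(tree, x):
    # Follow test outcomes until only a leaf (candidate list) remains.
    while isinstance(tree, tuple):
        test, yes, no = tree
        tree = yes if test(x) else no
    return tree

# Toy instance: distinguish the integers 0..7 using bit tests.
items = list(range(8))
tests = [lambda x, b=b: bool((x >> b) & 1) for b in range(3)]
tree = greedy_decision_tree(tests, items)
print(identify(tree, 5))   # -> [5]
```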