Efficient Neural Network Quantization and Decision Tree Learning

The field of neural network quantization is advancing rapidly, with a focus on developing efficient and accurate methods for reducing the computational requirements of deep neural networks. Researchers are exploring approaches such as probabilistic frameworks and double binary factorization to improve performance and reduce memory overhead. There is also growing interest in decision tree learning, with an emphasis on constructing small decision trees with few outliers and on efficient algorithms for integer-only decision tree inference. Noteworthy papers in this area include:

  • A probabilistic framework for dynamic quantization, which achieves negligible loss in performance while reducing computational overhead (see the first sketch after this list).
  • Double Binary Factorization, which preserves the efficiency advantages of binary representations while achieving competitive compression rates (see the second sketch below).
  • InTreeger, an end-to-end framework for integer-only decision tree inference, which enables the generation of highly optimized integer-only classification models (see the third sketch below).
  • A simple approximation algorithm for the optimal decision tree problem, offering a straightforward solution to a fundamental problem in machine learning (see the final sketch below).
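
For context on the first item, here is a minimal sketch of generic dynamic quantization, where scales are computed per tensor at run time rather than calibrated offline. It illustrates the general mechanism only, not the probabilistic framework proposed in the paper:

```python
# Minimal sketch of dynamic (runtime) int8 quantization, for illustration
# only. This is a generic symmetric per-tensor scheme, NOT the probabilistic
# framework from the paper.
import numpy as np

def dynamic_quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization with a scale computed at runtime."""
    scale = np.max(np.abs(x)) / 127.0 if np.any(x) else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

x = np.random.randn(4, 8).astype(np.float32)
q, s = dynamic_quantize_int8(x)
print("max abs error:", np.max(np.abs(dequantize(q, s) - x)))
```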
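
The second item factorizes weight matrices into a product of two binary matrices. As a simpler illustration of approximating weights with two binary components plus scalar scales, the following sketch uses classic two-term residual binarization, a related but distinct technique:

```python
# Minimal sketch: W ~= a1 * B1 + a2 * B2 with B1, B2 in {-1, +1}.
# This sum-of-two-binaries form (residual binarization) is shown only to
# illustrate the compression idea; the paper's Double Binary Factorization
# uses a *product* of binary matrices instead.
import numpy as np

def two_term_binarize(W: np.ndarray):
    B1 = np.where(W >= 0, 1.0, -1.0)
    a1 = np.mean(np.abs(W))      # optimal scalar scale for sign(W)
    R = W - a1 * B1              # residual after the first binary term
    B2 = np.where(R >= 0, 1.0, -1.0)
    a2 = np.mean(np.abs(R))
    return a1, B1, a2, B2

W = np.random.randn(64, 64).astype(np.float32)
a1, B1, a2, B2 = two_term_binarize(W)
approx = a1 * B1 + a2 * B2
print("relative error:", np.linalg.norm(W - approx) / np.linalg.norm(W))
```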
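
For the third item, here is a minimal sketch of what integer-only decision tree inference means in practice: features are pre-quantized to integers and thresholds are stored as integers, so tree traversal needs no floating-point operations. InTreeger's actual code-generation pipeline is more involved:

```python
# Minimal sketch of integer-only decision tree inference. Features and
# thresholds are integers, so traversal is entirely integer arithmetic.
# Illustrative only; not InTreeger's generated code.
from dataclasses import dataclass

@dataclass
class Node:
    feature: int = -1           # -1 marks a leaf
    threshold: int = 0          # integer threshold on a quantized feature
    left: "Node | None" = None
    right: "Node | None" = None
    label: int = 0              # class label at leaves

def predict(node: Node, x: list[int]) -> int:
    while node.feature != -1:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label

# Tiny hand-built tree over two int8-quantized features (illustrative).
tree = Node(feature=0, threshold=42,
            left=Node(label=0),
            right=Node(feature=1, threshold=-7,
                       left=Node(label=1), right=Node(label=2)))
print(predict(tree, [50, -10]))  # -> 1
```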
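
Finally, to convey the flavor of simple approximation algorithms for optimal decision tree, here is the classic greedy heuristic for the object-identification formulation, which repeatedly picks the binary test that splits the remaining objects most evenly. This is a textbook heuristic and not necessarily the algorithm from the paper:

```python
# Minimal sketch of the classic greedy heuristic for the decision tree
# (object identification) problem. Illustrative only; the paper's algorithm
# and guarantees may differ.
def greedy_tree(objects: list[int], tests: list[set[int]]):
    if len(objects) <= 1:
        return objects  # identified (or empty)
    # Pick the test whose yes/no split of `objects` is most balanced.
    best = min(range(len(tests)),
               key=lambda t: abs(2 * sum(o in tests[t] for o in objects)
                                 - len(objects)))
    yes = [o for o in objects if o in tests[best]]
    no = [o for o in objects if o not in tests[best]]
    if not yes or not no:
        return objects  # no test distinguishes the remaining objects
    return {"test": best, "yes": greedy_tree(yes, tests),
            "no": greedy_tree(no, tests)}

# Four objects; each test is given as the set of objects it accepts.
tests = [{0, 1}, {0, 2}, {1, 3}]
print(greedy_tree([0, 1, 2, 3], tests))
```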

Sources

A Probabilistic Framework for Dynamic Quantization

Addition is Almost All You Need: Compressing Neural Networks with Double Binary Factorization

MID-L: Matrix-Interpolated Dropout Layer with Layer-wise Neuron Selection

End-to-End Fully-Binarized Network Design: From Generic Learned Thermometer to Block Pruning

Layer-wise Quantization for Quantized Optimistic Dual Averaging

Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference

InTreeger: An End-to-End Framework for Integer-Only Decision Tree Inference

A Simple Approximation Algorithm for Optimal Decision Tree

Learning Small Decision Trees with Few Outliers: A Parameterized Perspective

Decision DNNFs with Imbalanced Conjunction Cannot Efficiently Represent CNFs of Bounded Width
