Interpretable Machine Learning Advances

The field of machine learning is moving toward greater interpretability, with a focus on methods whose predictions are both accurate and explainable. Recent developments have centered on feature importance ranking, personalized prediction, and influence function computation. Notably, new approaches compute influence functions efficiently enough to apply them to large-scale models, and advances in feature attribution have produced lightweight, architecture-aware techniques suitable for real-time applications. These innovations have significant implications for high-stakes decision-making domains such as healthcare and biometrics. Noteworthy papers include RAMPART, which introduces a framework for ranking the top-k features with high accuracy and theoretical guarantees; DeepACTIF, a lightweight feature attribution method that leverages the internal activations of sequence models to estimate feature importance efficiently; and RMT-KD, a compression method that applies Random Matrix Theory to knowledge distillation, reducing network size while maintaining stability and accuracy.
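For context on the influence-function work, the classical formulation (Koh & Liang, 2017) approximates the effect of a training example z_i on the loss at a test point z_test as -grad(L(z_test))^T H^{-1} grad(L(z_i)), where H is the Hessian of the training loss at the fitted parameters. The sketch below computes this quantity explicitly on a toy logistic-regression model. It is a minimal illustration of the baseline computation that efficiency-oriented methods, such as the dropout-based compression approach cited above, aim to accelerate; it is not an implementation of that paper, and the model, data, and sizes are illustrative assumptions.

```python
# Minimal sketch of classical influence functions (Koh & Liang, 2017),
# using an explicit Hessian on a tiny model for clarity. NOT the
# dropout-compression method from the cited paper; all model/data
# choices here are illustrative assumptions.
import torch

torch.manual_seed(0)

# Tiny logistic-regression model: 5 features -> 1 logit.
w = torch.randn(5, requires_grad=True)
X = torch.randn(20, 5)                     # 20 synthetic training points
y = torch.randint(0, 2, (20,)).float()     # binary labels

def loss_fn(w, x, t):
    return torch.nn.functional.binary_cross_entropy_with_logits(x @ w, t)

# Hessian of the training loss w.r.t. the parameters, row by row.
train_loss = loss_fn(w, X, y)
grad, = torch.autograd.grad(train_loss, w, create_graph=True)
H = torch.stack([torch.autograd.grad(g, w, retain_graph=True)[0] for g in grad])
H = H + 1e-3 * torch.eye(5)                # damping for invertibility

# Influence of training point i on a test point's loss:
# I(z_i, z_test) = -grad_test^T H^{-1} grad_i
x_test, y_test = torch.randn(5), torch.tensor(1.0)
grad_test, = torch.autograd.grad(loss_fn(w, x_test, y_test), w)
h_inv_grad_test = torch.linalg.solve(H, grad_test)

influences = []
for i in range(len(X)):
    grad_i, = torch.autograd.grad(loss_fn(w, X[i], y[i]), w)
    influences.append(-(grad_i @ h_inv_grad_test).item())

# Training points with the largest (absolute) influence on this prediction.
top = sorted(range(len(X)), key=lambda i: abs(influences[i]), reverse=True)[:3]
print("top influential training indices:", top)
```

The explicit Hessian here is only feasible for very small parameter counts; at scale, the H^{-1} grad_test product is typically replaced by an iterative approximation (e.g., Hessian-vector products), which is precisely the bottleneck the efficiency-focused methods above target.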

Sources

Top-$k$ Feature Importance Ranking

Personalized Prediction By Learning Halfspace Reference Classes Under Well-Behaved Distribution

Toward Efficient Influence Function: Dropout as a Compression Tool

RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation

DeepACTIF: Efficient Feature Attribution via Activation Traces in Neural Sequence Models

Discovering Association Rules in High-Dimensional Small Tabular Data
