Advances in Neural Network Interpretability and Stability

Recent work on neural networks is converging on a clearer account of how these models reach their decisions and when their behavior is stable. On the interpretability side, abstraction and refinement techniques are being developed to produce provably sufficient explanations of a network's predictions while making the underlying verification more efficient. On the stability side, interest is growing in phenomena such as neural collapse and in the stability of the Jacobian matrix in deep networks.

Two papers are especially noteworthy. "Evaluating Neuron Explanations: A Unified Framework with Sanity Checks" introduces a unified framework for evaluating neuron explanations, which makes it possible to compare existing evaluation metrics and to propose reliable evaluation guidelines. "Interior-Point Vanishing Problem in Semidefinite Relaxations for Neural Network Verification" addresses the interior-point vanishing problem and offers practical solutions that make semidefinite programming-based verification more applicable to deeper networks.
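As a rough illustration of what "stability of the Jacobian matrix" refers to, the sketch below computes the input-output Jacobian of a small random network and inspects its singular values. It is not taken from any of the papers listed here. The tanh activation, the width and depth, the 1/sqrt(width) Gaussian weight scale, and the helper name `mlp_jacobian` are all assumptions chosen for illustration.

```python
import numpy as np


def mlp_jacobian(weights, x):
    """Return (output, Jacobian of output w.r.t. x) for a bias-free tanh MLP."""
    h = x
    jac = np.eye(x.size)
    for w in weights:
        h = np.tanh(w @ h)
        # Chain rule: d tanh(z)/dz = 1 - tanh(z)^2, applied row-wise to W @ J.
        jac = (1.0 - h**2)[:, None] * (w @ jac)
    return h, jac


# Hypothetical setup: square layers with i.i.d. Gaussian weights.
rng = np.random.default_rng(0)
width, depth = 32, 10
weights = [rng.normal(size=(width, width)) / np.sqrt(width) for _ in range(depth)]
x = rng.normal(size=width)

y, jac = mlp_jacobian(weights, x)

# Singular values are returned in descending order. A spectrum that neither
# explodes nor collapses towards zero as depth grows is one informal reading
# of a "stable" Jacobian.
singular_values = np.linalg.svd(jac, compute_uv=False)
print("largest singular value: ", singular_values[0])
print("smallest singular value:", singular_values[-1])
```

Re-running the script with a different `depth` or weight scale shows how quickly the spectrum can drift, which is the kind of behavior a stability analysis aims to characterize.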

Sources

Grokking Beyond the Euclidean Norm of Model Parameters

Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model

Explaining, Fast and Slow: Abstraction and Refinement of Provable Explanations

On the Stability of the Jacobian Matrix in Deep Neural Networks

Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models

Abstraction-Based Proof Production in Formal Verification of Neural Networks

Interior-Point Vanishing Problem in Semidefinite Relaxations for Neural Network Verification
