Advances in Neural Network Representations and Generalization

The field of neural networks is moving toward a deeper understanding of the internal mechanisms that let these models build meaningful representations and generalize well. Recent work characterizes the semantic content of hidden representations, showing that the initial layers of a deep network produce a unimodal probability density that discards structure irrelevant to classification, while later layers develop density peaks in a hierarchical fashion that mirrors the semantic hierarchy of the concepts being classified. In parallel, the implicit bias of the training dynamics has been shown to drive neural collapse, a phenomenon in which the last-layer features of a trained network converge to a simple, highly symmetric structure. Noteworthy papers in this area include:

  • Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data, which advances prior results on neural collapse by relaxing the unconstrained-features assumption and revealing the role of implicit bias (the standard collapse metric is sketched after this list).
  • An unsupervised tour through the hidden pathways of deep neural networks, which introduces a method for estimating the intrinsic dimension of data and studies how the probability density evolves across hidden layers (see the intrinsic-dimension sketch below).
  • Generalization Bounds for Rank-sparse Neural Networks, which proves generalization bounds for networks whose weight matrices have approximately low-rank structure (see the stable-rank sketch below).
  • Scaling Non-Parametric Sampling with Representation, which proposes a simple non-parametric generative model that produces high-fidelity samples on MNIST and CIFAR-10.
  • On the Anisotropy of Score-Based Generative Models, which introduces Score Anisotropy Directions (SADs) to reveal how different networks preferentially capture data structure.
  • On Measuring Localization of Shortcuts in Deep Networks, which investigates the layer-wise localization of shortcuts in deep models and finds that shortcut learning is distributed throughout the network.
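Neural collapse is commonly quantified with the NC1 statistic of Papyan, Han, and Donoho, which compares within-class variability to between-class spread and tends to zero as last-layer features collapse onto their class means. The sketch below computes one common formulation of NC1 in NumPy; the function name and the normalization by the number of classes are illustrative choices, not taken from the gradient-flow paper above.

```python
import numpy as np

def nc1_metric(features: np.ndarray, labels: np.ndarray) -> float:
    """Within-class variability collapse: NC1 = tr(Sigma_W @ pinv(Sigma_B)) / C.

    As last-layer features collapse onto their class means, NC1 -> 0.
    `features` is (N, d); `labels` is (N,) with integer class ids.
    """
    classes = np.unique(labels)
    d = features.shape[1]
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((d, d))  # within-class scatter
    sigma_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        h_c = features[labels == c]
        centered = h_c - h_c.mean(axis=0)
        sigma_w += centered.T @ centered
        diff = (h_c.mean(axis=0) - global_mean)[:, None]
        sigma_b += diff @ diff.T
    sigma_w /= len(features)
    sigma_b /= len(classes)

    # The pseudo-inverse handles Sigma_B being rank-deficient (rank <= C - 1).
    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / len(classes))
```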
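Intrinsic-dimension estimates of the kind used in the unsupervised-tour paper are often computed with the TwoNN estimator of Facco et al. (2017), which needs only each point's two nearest neighbors; whether this specific paper uses TwoNN is an assumption. A minimal sketch of that estimator:

```python
import numpy as np
from scipy.spatial import cKDTree

def twonn_id(points: np.ndarray) -> float:
    """TwoNN intrinsic-dimension estimate (Facco et al., 2017).

    The ratio mu = r2 / r1 of each point's second- and first-nearest-neighbor
    distances follows a Pareto law whose exponent equals the intrinsic
    dimension, giving the maximum-likelihood estimate d = N / sum(log mu).
    """
    dists, _ = cKDTree(points).query(points, k=3)  # self + two neighbors
    r1, r2 = dists[:, 1], dists[:, 2]
    keep = r1 > 0  # drop exact duplicates, which break the ratio
    mu = r2[keep] / r1[keep]
    return float(keep.sum() / np.log(mu).sum())
```

Applied to the activations of each hidden layer in turn, this yields the kind of layer-wise intrinsic-dimension profile such studies track.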
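Generalization bounds that exploit approximate low rank typically depend on a soft rank measure; one standard choice is the stable rank ||W||_F^2 / ||W||_2^2, which never exceeds the true rank and is small when the spectrum decays quickly. Whether the rank-sparse paper uses exactly this quantity is an assumption; the sketch below shows only the measurement itself.

```python
import numpy as np

def stable_rank(w: np.ndarray) -> float:
    """Stable rank ||W||_F^2 / ||W||_2^2, a soft proxy for matrix rank.

    It never exceeds rank(W) and is small when the singular values
    decay quickly, i.e. when W is approximately low rank.
    """
    s = np.linalg.svd(w, compute_uv=False)  # singular values, descending
    return float((s ** 2).sum() / s[0] ** 2)
```

Applied layer by layer to a trained network's weight matrices, this gives a quick empirical check of how rank-sparse a given model actually is.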

Sources

Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data

An unsupervised tour through the hidden pathways of deep neural networks

Long-tailed Species Recognition in the NACTI Wildlife Dataset

Generalization Bounds for Rank-sparse Neural Networks

Scaling Non-Parametric Sampling with Representation

On the Anisotropy of Score-Based Generative Models

On Measuring Localization of Shortcuts in Deep Networks
