New Directions in Neural Architecture and Representation Learning

The field of neural architecture and representation learning is advancing quickly, with a growing focus on efficient, scalable, and interpretable models. Recent work suggests that explicit graph neural networks (GNNs) may be unnecessary for some tasks, since multi-layer perceptrons (MLPs) can capture the relevant structural information on their own. In parallel, the connection between Transformers and GNNs is being made precise: a Transformer can be viewed as a message-passing GNN operating on a fully connected graph, in which every token aggregates attention-weighted messages from every other token. Distributed neural architectures are also being introduced, allowing input to be processed flexibly and efficiently across modules. Finally, there is growing interest in the geometry of neural network loss landscapes, including linear mode connectivity between trained models, with implications for generalization, optimization, and model fusion. Noteworthy papers include 'Do We Really Need GNNs with Explicit Structural Modeling? MLPs Suffice for Language Model Representations', which challenges the need for GNNs in language model representations, and 'Transformers are Graph Neural Networks', which formalizes the Transformer-GNN correspondence.
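
To make the Transformer-GNN correspondence concrete, the sketch below is a minimal NumPy illustration (the function name, weight matrices, and shapes are made up for this example, not taken from any of the listed papers). It writes single-head self-attention as message passing on a fully connected graph: every token sends a value "message" to every other token, and the softmaxed attention scores act as input-dependent edge weights.

    import numpy as np

    def attention_as_message_passing(X, Wq, Wk, Wv):
        # Node features X (n_tokens x d) are projected to queries, keys, values.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Edge scores over the complete graph: every token i scores every token j.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        # Softmax over the "neighbourhood" (here: all tokens) gives message weights.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        # Each node aggregates the weighted value messages from all nodes.
        return weights @ V

    # Illustrative usage: 4 tokens, 8-dimensional embeddings, a single head.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    updated_nodes = attention_as_message_passing(X, Wq, Wk, Wv)  # shape (4, 8)

Restricting the softmax to a sparse neighbourhood instead of all tokens recovers ordinary attention-based message passing on a non-complete graph, which is the sense in which a standard Transformer is the fully connected special case.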

Sources

Do We Really Need GNNs with Explicit Structural Modeling? MLPs Suffice for Language Model Representations

Transformers are Graph Neural Networks

Towards Distributed Neural Architectures

Generalized Linear Mode Connectivity for Transformers

Geminet: Learning the Duality-based Iterative Process for Lightweight Traffic Engineering in Changing Topologies

Low-latency vision transformers via large-scale multi-head attention

Model Fusion via Neuron Interpolation

Real-Time In-Network Machine Learning on P4-Programmable FPGA SmartNICs with Fixed-Point Arithmetic and Taylor

Visual Anagrams Reveal Hidden Differences in Holistic Shape Processing Across Vision Models

NN-Former: Rethinking Graph Structure in Neural Architecture Representation

Fast AI Model Splitting over Edge Networks

Proof of a perfect platonic representation hypothesis

GradMetaNet: An Equivariant Architecture for Learning on Gradients

PERTINENCE: Input-based Opportunistic Neural Network Dynamic Execution

Towards Decentralized and Sustainable Foundation Model Training with the Edge
