State-Space Models and Automata Learning

Recent work on state-space models (SSMs) and automata learning focuses on improving expressivity, efficiency, and scalability. New architectures include hybrid models that combine SSMs with attention mechanisms, which have shown promising results in electronic health record (EHR) representation learning. On the theoretical side, researchers have analyzed the learning dynamics and gradient-based optimization of SSMs. The Mamba model has received particular attention, with analyses of its in-context learning capabilities, its robustness to outliers, and its training dynamics. In automata learning, new approaches include passive learning of lattice automata and efficient decomposition identification of deterministic finite automata (DFAs).

Noteworthy papers include: Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models, which proposes a novel parametrization of transition matrices that enables state tracking with efficient computation and optimal state size and depth; and HyMaTE: A Hybrid Mamba and Transformer Model for EHR Representation Learning, which reports state-of-the-art results in EHR representation learning by combining state-space models with advanced attention mechanisms.
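To make the notion of state tracking with sparse transition matrices concrete, here is a minimal sketch. It is not the parametrization from the paper above; it only assumes a linear recurrence h_t = A(x_t) h_{t-1} in which each input symbol selects a sparse (here, permutation) matrix, applied to a two-state parity automaton. All names (`TRANSITIONS`, `track_state`, `parity`) are hypothetical and chosen for illustration.

```python
# Illustrative only: a parity automaton tracked by the linear recurrence
# h_t = A(x_t) h_{t-1}, where every A(x) is a sparse permutation matrix.
# This is an assumed toy setup, not the method of any paper listed here.

def matvec(matrix, vector):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

# One transition matrix per input symbol (hypothetical names and values).
TRANSITIONS = {
    0: [[1, 0], [0, 1]],  # input 0: stay in the current state
    1: [[0, 1], [1, 0]],  # input 1: swap the two states
}

def track_state(inputs, initial=(1, 0)):
    """Apply h_t = A(x_t) h_{t-1} over the inputs; return the final one-hot state."""
    h = list(initial)
    for x in inputs:
        h = matvec(TRANSITIONS[x], h)
    return h

def parity(inputs):
    """Index of the active state: 0 for an even number of ones, 1 for odd."""
    return track_state(inputs).index(1)
```

Because the hidden state stays one-hot and each matrix has a single nonzero entry per row, the recurrence follows the automaton exactly; for example, `parity([1, 0, 1, 1])` tracks three ones and ends in the odd state.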

Sources

Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models

Passive Learning of Lattice Automata from Recurrent Neural Networks

Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression

HyMaTE: A Hybrid Mamba and Transformer Model for EHR Representation Learning

Efficient Decomposition Identification of Deterministic Finite Automata from Examples

Can Mamba Learn In Context with Outliers? A Theoretical Generalization Analysis

Memory Determines Learning Direction: A Theory of Gradient-Based Optimization in State Space Models
