Advances in Multimodal Data Modeling and Human Activity Recognition

The field of data modeling and human activity recognition is moving toward more versatile and efficient architectures. Recent research has focused on models that can handle multimodal data, including mixed categorical and numeric values as well as time-series data from heterogeneous sensors. One key direction is the application of transformer-based models to these complex data types, improving both accuracy and generalizability across tasks. Another important trend is the use of knowledge distillation, which transfers knowledge from large, complex models to smaller, more efficient ones, enabling real-time and energy-efficient processing on constrained devices. Notable papers include:

  • multivariateGPT, which presents a single architecture for modeling sequences of mixed categorical and numeric data, demonstrating its ability to learn patterns in complex time series.
  • MultiFormer, which proposes a wireless sensing system for multi-person pose estimation based on Channel State Information (CSI) and attention mechanisms, achieving higher accuracy than state-of-the-art approaches.
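The distillation trend mentioned above can be illustrated with a minimal sketch. This is not the implementation from any of the papers listed; it is the standard soft-label distillation loss (temperature-softened teacher and student distributions compared via KL divergence, scaled by the squared temperature), written in plain Python for clarity.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution, optionally
    softened by a temperature > 1 to expose 'dark knowledge'."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the softened teacher and student
    distributions, scaled by T^2 so gradients stay comparable
    across temperatures (the usual soft-label convention)."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl
```

In practice this term is mixed with the ordinary cross-entropy on ground-truth labels, and the small student is trained on the combined objective; the loss is zero when the student exactly matches the teacher and positive otherwise.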

Sources

multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data

Self-supervised Learning Method Using Transformer for Multi-dimensional Sensor Data Processing

Improving Respiratory Sound Classification with Architecture-Agnostic Knowledge Distillation from Ensembles

MultiFormer: A Multi-Person Pose Estimation System Based on CSI and Attention Mechanism

Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition
