Advances in Multimodal Emotion Recognition and Facial Expression Analysis

Emotion recognition and facial expression analysis are advancing rapidly. Recent research concentrates on improving the accuracy and robustness of multimodal sentiment analysis and facial expression recognition (FER) systems. Transformer-based models and attention mechanisms have proven effective at capturing cross-modal interactions, while texture key driver factors and adaptive cross-modal fusion techniques have produced state-of-the-art FER results. Noteworthy papers include:

  • A paper proposing attention mechanisms that combine Triplet attention with Squeeze-and-Excitation, achieving state-of-the-art results on the FER2013 dataset (a minimal sketch of one such combination follows this list).
  • A paper introducing the Transformer-based Adaptive Cross-modal Fusion Network (TACFN) for multimodal emotion recognition, demonstrating a clear performance improvement over competing fusion methods (see the cross-modal fusion sketch below).
  • A paper presenting a framework built around Texture Key Driver Factors (TKDF) for FER, achieving state-of-the-art performance on the RAF-DB and KDEF datasets (a speculative texture-branch sketch appears below).
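
The digest does not specify how the first paper chains Triplet attention and Squeeze-and-Excitation, so the following is a minimal sketch under that stated assumption: three Triplet-attention-style cross-dimension branches are averaged, and a standard SE channel gate is applied afterward. All class names here are illustrative, not the paper's.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Standard Squeeze-and-Excitation: global pool -> bottleneck MLP -> channel gate."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gate = self.fc(x.mean(dim=(2, 3)))    # squeeze to (B, C)
        return x * gate.view(b, c, 1, 1)      # excite: rescale channels


class ZPoolGate(nn.Module):
    """Triplet-attention branch: concat max+mean over dim 1, conv, sigmoid gate."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat([x.max(dim=1, keepdim=True).values,
                            x.mean(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))


class TripletSE(nn.Module):
    """Averages three rotated Z-pool branches (Triplet attention), then applies SE."""
    def __init__(self, channels: int):
        super().__init__()
        self.branch_hw = ZPoolGate()   # plain spatial branch (H, W)
        self.branch_cw = ZPoolGate()   # channel-width interaction
        self.branch_ch = ZPoolGate()   # channel-height interaction
        self.se = SEBlock(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_hw = self.branch_hw(x)
        # rotate so the channel dim takes part in the "spatial" gating, then rotate back
        out_cw = self.branch_cw(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)
        out_ch = self.branch_ch(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)
        return self.se((out_hw + out_cw + out_ch) / 3.0)
```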

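To illustrate the cross-modal attention fusion that the summary attributes to transformer-based approaches such as TACFN, here is a minimal sketch in which one modality queries another through multi-head attention and a learned gate mixes the result back in adaptively. The module name, dimensions, and gating step are assumptions; the digest does not describe the paper's exact architecture.

```python
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """One modality (e.g., audio) queries another (e.g., vision) via attention,
    then a learned per-token gate adaptively mixes the attended features in."""
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_seq: torch.Tensor, context_seq: torch.Tensor) -> torch.Tensor:
        # query_seq: (B, Tq, D) modality A; context_seq: (B, Tk, D) modality B
        attended, _ = self.attn(query_seq, context_seq, context_seq)
        g = self.gate(torch.cat([query_seq, attended], dim=-1))  # adaptive mix weight
        return self.norm(query_seq + g * attended)


# usage: fuse audio tokens with visual tokens
audio = torch.randn(8, 20, 256)            # (batch, audio steps, dim)
video = torch.randn(8, 16, 256)            # (batch, video steps, dim)
fused = CrossModalFusion()(audio, video)   # (8, 20, 256)
```
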
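The digest only names "texture key driver factors" without detailing the mechanism, so the sketch below is purely speculative: it derives a texture-emphasis mask from a fixed high-pass (Laplacian) filter, which could steer a FER backbone toward textured facial regions. This is not the paper's method, only one plausible reading of the idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextureBranch(nn.Module):
    """Extracts a high-frequency map from the image and learns a spatial gate from it."""
    def __init__(self):
        super().__init__()
        lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
        self.register_buffer("kernel", lap.view(1, 1, 3, 3))  # fixed Laplacian high-pass
        self.gate = nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (B, C, H, W); average channels, high-pass filter, learn a gate
        gray = image.mean(dim=1, keepdim=True)
        highfreq = F.conv2d(gray, self.kernel, padding=1)
        return torch.sigmoid(self.gate(highfreq.abs()))  # (B, 1, H, W) texture mask


# usage: emphasize textured regions before a backbone, e.g. backbone(img * (1 + mask))
img = torch.randn(4, 3, 112, 112)
mask = TextureBranch()(img)   # (4, 1, 112, 112)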
Sources

  • Achieving 3D Attention via Triplet Squeeze and Excitation Block
  • Multimodal Sentiment Analysis on CMU-MOSEI Dataset using Transformer-based Models
  • TACFN: Transformer-based Adaptive Cross-modal Fusion Network for Multimodal Emotion Recognition
  • TKFNet: Learning Texture Key Factor Driven Feature for Facial Expression Recognition