Multimodal Sentiment Analysis and Emotion Recognition

Multimodal sentiment analysis and emotion recognition are moving towards models that capture complex cross-modal interactions and integrate diverse opinion modalities. Researchers are proposing frameworks and architectures that adaptively integrate multi-level features, regulate cross-layer information flow, and achieve balanced representation learning. Geometric deep learning, dynamic fusion, and multi-level fusion methods are gaining traction, and supervisory documentation assistance and privileged information are being explored to strengthen text feature extraction and improve prediction performance.
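To make the idea of dynamic, gated fusion concrete, here is a minimal PyTorch sketch of a module that learns per-sample weights over text, audio, and visual features. It is an illustrative example only; the module name, dimensions, and design are hypothetical and not taken from any of the cited papers.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Illustrative gated fusion of text, audio, and visual features.

    A learned gate decides, per sample, how much each modality
    contributes to the fused representation.
    """

    def __init__(self, dim: int, n_modalities: int = 3):
        super().__init__()
        self.gate = nn.Linear(dim * n_modalities, n_modalities)

    def forward(self, text, audio, visual):
        # Each input: (batch, dim). Stack modalities: (batch, 3, dim).
        feats = torch.stack([text, audio, visual], dim=1)
        # Per-sample gate weights over the modalities, summing to 1.
        weights = torch.softmax(self.gate(feats.flatten(1)), dim=-1)
        # Weighted sum across modalities: (batch, dim).
        return (weights.unsqueeze(-1) * feats).sum(dim=1)

fusion = GatedFusion(dim=128)
t, a, v = (torch.randn(4, 128) for _ in range(3))
fused = fusion(t, a, v)  # (4, 128)
```

The softmax gate is the key design choice: rather than concatenating modalities with fixed importance, the network can down-weight an uninformative modality (e.g. noisy audio) on a per-sample basis.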

Noteworthy papers include:

RecruitView, a multimodal dataset for predicting personality and interview performance, paired with a geometric deep learning framework that achieves superior performance while training fewer parameters.

DyFuLM, a multimodal sentiment analysis framework whose hierarchical dynamic fusion module and gated feature aggregation module achieve state-of-the-art results on multi-task sentiment datasets.

PSA-MF, a personality-sentiment aligned multi-level fusion framework that integrates sentiment-related information across modalities and achieves state-of-the-art results on two commonly used datasets.
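Several of the listed works rely on cross-modal attention to capture interactions between modalities. The sketch below shows the general pattern, where text tokens query features from another modality; it is a generic illustration under assumed shapes and names, not the architecture of any specific paper above.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Sketch of cross-modal attention: text tokens query another modality."""

    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text, other):
        # text: (batch, t_len, dim); other: (batch, o_len, dim).
        attended, _ = self.attn(query=text, key=other, value=other)
        # Residual connection preserves the original text signal.
        return self.norm(text + attended)

xattn = CrossModalAttention(dim=64)
text = torch.randn(2, 10, 64)   # e.g. token embeddings
audio = torch.randn(2, 50, 64)  # e.g. frame-level acoustic features
out = xattn(text, audio)        # (2, 10, 64)
```

Because queries come from one modality and keys/values from another, each text token can attend to the most relevant audio frames, which is how such models ground sentiment cues that text alone misses.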

Sources

RecruitView: A Multimodal Dataset for Predicting Personality and Interview Performance for Human Resources Applications

Developing a Comprehensive Framework for Sentiment Analysis in Turkish

Sentiment Analysis and Emotion Classification using Machine Learning Techniques for Nagamese Language - A Low-resource Language

DyFuLM: An Advanced Multimodal Framework for Sentiment Analysis

PSA-MF: Personality-Sentiment Aligned Multi-Level Fusion for Multimodal Sentiment Analysis

Empathy Level Prediction in Multi-Modal Scenario with Supervisory Documentation Assistance

Multi-Modal Opinion Integration for Financial Sentiment Analysis using Cross-Modal Attention

Cross-Space Synergy: A Unified Framework for Multimodal Emotion Recognition in Conversation

Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention
