Multilingual Speech Recognition and Text Analysis

Speech recognition and text analysis are advancing through the integration of large language models (LLMs) with automatic speech recognition (ASR) systems. A key trend is the development of new training paradigms, such as iterative training, that address overfitting in Low-Rank Adaptation (LoRA) and improve model performance. Researchers are also exploring generative error correction and multi-modal approaches to make transcriptions more accurate, particularly for accented speech. Another area of focus is the application of LoRA to text-based tasks, such as multilingual sexism detection, where conditional adapter routing and a hierarchical structuring of subtasks are used to achieve strong performance.

Noteworthy papers include the following. "ILT-Iterative LoRA Training through Focus-Feedback-Fix for Multilingual Speech Recognition" proposes an iterative training paradigm and achieves strong results in the Interspeech 2025 Multilingual Conversational Speech Language Modeling Challenge. "Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition" reports a relative word error rate (WER) reduction of 67.35% compared to the Whisper-large-v3 baseline. "Mario at EXIST 2025: A Simple Gateway to Effective Multilingual Sexism Detection" achieves competitive performance across all subtasks with minimal preprocessing and reduced training time.
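For readers unfamiliar with the metric behind the 67.35% figure, the sketch below shows how WER and a relative WER reduction are commonly computed. It is a minimal illustration, not the evaluation code of any paper above: the function names `word_error_rate` and `relative_wer_reduction` and the example sentences are hypothetical, and real evaluations usually apply text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction of the baseline's WER that the new system removes."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical sentences, chosen only to exercise the functions.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
    # Hypothetical WER values, not taken from any of the papers.
    print(f"Relative reduction: {relative_wer_reduction(0.20, 0.10):.1%}")
```

A relative reduction is measured against the baseline's own error rate, so a 67.35% relative reduction means the new system makes roughly one third as many word errors as the baseline. It does not mean the absolute WER dropped by 67.35 percentage points.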
