Dysarthric Speech Recognition and Personalized Speech Synthesis

The field of speech recognition and synthesis is moving toward more personalized and accessible solutions, particularly for individuals with dysarthric speech. Recent work focuses on improving recognition accuracy for dysarthric speakers and on building more effective, natural-sounding text-to-speech (TTS) systems. Notable techniques include synthetic speech generation, knowledge anchoring, and curriculum learning, all of which strengthen recognition and synthesis models and stand to significantly improve communication for people with speech impairments. Noteworthy papers: "Improved Dysarthric Speech to Text Conversion via TTS Personalization" presents a method for generating synthetic dysarthric speech to fine-tune ASR models; "Bridging ASR and LLMs for Dysarthric Speech Recognition" benchmarks self-supervised ASR models and introduces LLM-based decoding to improve recognition of low-intelligibility speech; and "Facilitating Personalized TTS for Dysarthric Speakers Using Knowledge Anchoring and Curriculum Learning" proposes a knowledge anchoring framework that generates synthetic speech with fewer articulation errors.
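
To make the first approach concrete, the sketch below fine-tunes an off-the-shelf CTC-based ASR model (wav2vec 2.0 via Hugging Face Transformers) on synthetic dysarthric utterances assumed to come from a personalized TTS model. This is a minimal illustration, not the paper's actual pipeline: the checkpoint name, the SyntheticDysarthricDataset class, and the hyperparameters are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): fine-tuning a CTC ASR model on
# synthetic dysarthric speech produced by a personalized TTS system.
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

class SyntheticDysarthricDataset(Dataset):
    """Pairs of (synthetic dysarthric waveform, transcript).

    The waveforms are assumed to come from a TTS model personalized to a
    target dysarthric speaker; here we simply wrap precomputed examples."""
    def __init__(self, examples):  # examples: list of (waveform_tensor, text)
        self.examples = examples

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
model.freeze_feature_encoder()  # common when adapting on small datasets

def collate(batch):
    waveforms, texts = zip(*batch)
    inputs = processor([w.numpy() for w in waveforms],
                       sampling_rate=16_000, return_tensors="pt", padding=True)
    labels = processor(text=list(texts), return_tensors="pt", padding=True)
    # Mask label padding so it is ignored by the CTC loss.
    inputs["labels"] = labels.input_ids.masked_fill(
        labels.attention_mask.eq(0), -100)
    return inputs

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def fine_tune(examples, epochs=3):
    loader = DataLoader(SyntheticDysarthricDataset(examples), batch_size=4,
                        shuffle=True, collate_fn=collate)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            loss = model(**batch).loss  # CTC loss from the model head
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

Freezing the feature encoder and using a small learning rate are common choices when adapting on the small, speaker-specific corpora typical of dysarthric ASR; in practice, real recordings of the target speaker would be held out for evaluation.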

Sources

Improved Dysarthric Speech to Text Conversion via TTS Personalization

The 2D+ Dynamic Articulatory Model DYNARTmo: Tongue-Palate Contact Area Estimation

Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches

Iterative refinement, not training objective, makes HuBERT behave differently from wav2vec 2.0

Flow-SLM: Joint Learning of Linguistic and Acoustic Information for Spoken Language Modeling

Facilitating Personalized TTS for Dysarthric Speakers Using Knowledge Anchoring and Curriculum Learning
