Advances in Speech Enhancement and Recognition

Research in speech enhancement and recognition is moving toward more robust and efficient solutions that hold up in real-world conditions. Recent work reports improved speech quality and intelligibility across a range of environments, including on mobile and edge devices. Neural networks and advanced signal processing techniques underpin most of these gains. Notably, acoustic metamaterials have been proposed to strengthen voice assistant security, and post-processing neural networks have been proposed to mitigate artifacts introduced by speech enhancement models.

Noteworthy papers in this area include the following. A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions presents a neural network-based solution for acoustic echo cancellation in mobile scenarios. MetaGuardian: Enhancing Voice Assistant Security through Advanced Acoustic Metamaterials introduces a voice assistant protection system that uses acoustic metamaterials to defend against inaudible and adversarial attacks.

Sources

A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions

Audio-Visual Speech Enhancement: Architectural Design and Deployment Strategies

Revealing the Role of Audio Channels in ASR Performance Degradation

MetaGuardian: Enhancing Voice Assistant Security through Advanced Acoustic Metamaterials

Whisper Smarter, not Harder: Adversarial Attack on Partial Suppression

A dataset and model for recognition of audiologically relevant environments for hearing aids: AHEAD-DS and YAMNet+

Alternating Approach-Putt Models for Multi-Stage Speech Enhancement
