Advances in Speech Separation and Spatial Audio

Research in speech separation and spatial audio is moving toward more realistic and diverse acoustic scenarios, with two prominent goals: improving speech accessibility for children in noisy classrooms and making direction-of-arrival (DOA) estimation more robust and efficient. To improve separation quality, researchers are exploring new architectures and training strategies, such as spatially aware architectures and targeted adaptation. There is also growing interest in using large language models (LLMs) to synthesize more realistic spatial audio scenes. Noteworthy papers include:

  • A study on speech separation for hearing-impaired children in the classroom, which demonstrated that spatially aware architectures combined with targeted adaptation can improve speech accessibility.
  • A paper on DOA estimation using LLM-aided simulated acoustic scenes, which proposed a lightweight network that achieves satisfactory accuracy and robustness at low computational complexity.

Sources

Speech Separation for Hearing-Impaired Children in the Classroom

DOA Estimation with Lightweight Network on LLM-Aided Simulated Acoustic Scenes

A General Ziv-Zakai Bound for DoA Estimation in MIMO Radar Systems

Non-verbal Perception of Room Acoustics using Multi Dimensional Scaling Method

Sound impact of simple viscoelastic damping changes due to aging and the role of the double bentside on soundboard tension in a 1755 Dulcken harpsichord

Spatial Audio Rendering for Real-Time Speech Translation in Virtual Meetings
