Text-to-speech synthesis is moving toward more expressive and controllable speech generation, with a focus on reducing accent and linguistic biases and improving overall quality. Researchers are exploring new methods for accent generation, linguistic adaptation, and style control, leading to more natural and intelligible speech. Prosodic segmentation and voiced-aware style extraction are also being investigated to improve speech synthesis. Separately, there is growing interest in detecting academic dishonesty assisted by large language models, with keystroke dynamics emerging as a promising approach. Notable papers in this area include:
- CLARITY, which presents a framework for addressing accent and linguistic biases in text-to-speech synthesis.
- MF-Speech, which achieves fine-grained control over speech factors through factor disentanglement.
- Detecting LLM-Assisted Academic Dishonesty using Keystroke Dynamics, which introduces a keystroke-dynamics-based detector for identifying AI-assisted plagiarism.