Advancements in Text-to-Speech Synthesis and Academic Integrity

The field of text-to-speech synthesis is moving toward more expressive and controllable speech generation, with a focus on mitigating accent and linguistic biases and improving overall quality. Researchers are exploring new methods for accent generation, linguistic adaptation, and style control, with the aim of producing more natural and intelligible speech. Prosodic segmentation and voiced-aware style extraction are also being investigated as ways to improve speech synthesis. Separately, there is growing interest in detecting academic dishonesty assisted by large language models, with keystroke dynamics emerging as a promising approach. Notable papers in these areas include:

  • CLARITY, which presents a framework for addressing accent and linguistic biases in text-to-speech synthesis.
  • MF-Speech, which achieves fine-grained control over speech factors through factor disentanglement.
  • Detecting LLM-Assisted Academic Dishonesty using Keystroke Dynamics, which introduces a keystroke-dynamics-based detector for identifying AI-assisted plagiarism.

Sources

CLARITY: Contextual Linguistic Adaptation and Accent Retrieval for Dual-Bias Mitigation in Text-to-Speech Generation

MF-Speech: Achieving Fine-Grained and Compositional Control in Speech Generation via Factor Disentanglement

Detecting LLM-Assisted Academic Dishonesty using Keystroke Dynamics

The Impact of Prosodic Segmentation on Speech Synthesis of Spontaneous Speech

Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech
