Advances in Conversational AI

The field of conversational AI is moving towards more natural and interactive dialogue systems. Recent work focuses on improving how well models understand and respond to complex instructions, and on engaging users in more meaningful conversations. Researchers are exploring new methods for controlling paralinguistic features in text-to-speech systems, such as using natural-language instructions to modulate vocal timbre and emotional state. There is also growing interest in frameworks for analyzing and improving turn-taking dynamics in conversations, including computational models that quantify talk-time sharing and predict speech activity.

Noteworthy papers include InstructTTSEval, which introduces a benchmark for measuring how well text-to-speech systems follow complex natural-language style instructions; Prompt-Guided Turn-Taking Prediction, which proposes a model whose turn-taking predictions can be controlled dynamically via textual prompts; Aligning Spoken Dialogue Models from User Interactions, which presents a preference alignment framework for improving spoken dialogue models from user interactions; and Enhancing User Engagement in Socially-Driven Dialogue through Interactive LLM Alignments, which enables interactive LLMs to learn user engagement from signals in the future development of conversations.

Sources

InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems

Time is On My Side: Dynamics of Talk-Time Sharing in Video-chat Conversations

Prompt-Guided Turn-Taking Prediction

Aligning Spoken Dialogue Models from User Interactions

Enhancing User Engagement in Socially-Driven Dialogue through Interactive LLM Alignments
