Advancements in Large Language Models for Speech and Dialogue Applications

The field of large language models (LLMs) is evolving rapidly, with growing attention to their performance in speech and dialogue applications. Recent work highlights the importance of adaptability, personalization, and multimodal interaction: researchers are exploring role-playing dialogue agents, speech-based cognitive screening, and context-adaptive hearing aid fitting. Comprehensive benchmarks are also emerging, including TTA-Bench for text-to-audio models and VoxRole for speech-based role-playing agents. Noteworthy papers include Talk Less, Call Right, which combines automatic prompt optimization with role prompting to improve role-play LLM agents (see the sketch below), and Who Gets Left Behind?, which audits disability inclusivity in LLMs. In addition, LALM-Eval and AU-Harness provide open-source toolkits for holistic evaluation of large audio language models.
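To make the prompting ideas concrete, below is a minimal sketch of role prompting paired with a simple automatic prompt-optimization loop. This is a hypothetical illustration, not the actual method from Talk Less, Call Right: the RolePrompt structure, the candidate instruction edits, and the score_prompt heuristic are all assumptions standing in for what would, in practice, be an LLM-judged objective over held-out dialogues.

```python
# Hypothetical sketch: role prompting + toy automatic prompt optimization.
# All names here are illustrative assumptions, not the paper's API.
import random
from dataclasses import dataclass


@dataclass
class RolePrompt:
    persona: str       # who the agent is
    style: str         # how it should speak
    instruction: str   # task-level guidance

    def render(self) -> str:
        return (f"You are {self.persona}. Speak in a {self.style} tone. "
                f"{self.instruction}")


def score_prompt(prompt: str) -> float:
    """Placeholder objective: reward short, role-grounded prompts.
    A real system would score candidates with an LLM judge on dialogues."""
    brevity = 1.0 / (1.0 + len(prompt.split()))
    grounding = 1.0 if prompt.startswith("You are") else 0.0
    return grounding + brevity


def optimize(base: RolePrompt, steps: int = 50, seed: int = 0) -> RolePrompt:
    """Toy hill climbing over a fixed pool of instruction rewrites."""
    rng = random.Random(seed)
    edits = [
        "Answer concisely and stay in character.",
        "Only call tools when the user asks for an action.",
        "Ask a clarifying question before giving long answers.",
    ]
    best, best_score = base, score_prompt(base.render())
    for _ in range(steps):
        cand = RolePrompt(base.persona, base.style, rng.choice(edits))
        cand_score = score_prompt(cand.render())
        if cand_score > best_score:
            best, best_score = cand, cand_score
    return best


if __name__ == "__main__":
    base = RolePrompt("a museum tour guide", "friendly",
                      "Describe exhibits when asked.")
    print(optimize(base).render())
```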

Sources

Talk Less, Call Right: Enhancing Role-Play LLM Agents with Automatic Prompt Optimization and Role Prompting

Who Gets Left Behind? Auditing Disability Inclusivity in Large Language Models

TTA-Bench: A Comprehensive Benchmark for Evaluating Text-to-Audio Models

Speech-Based Cognitive Screening: A Systematic Evaluation of LLM Adaptation Strategies

VoxRole: A Comprehensive Benchmark for Evaluating Speech-Based Role-Playing Agents

Towards Stable and Personalised Profiles for Lexical Alignment in Spoken Human-Agent Dialogue

Context-Adaptive Hearing Aid Fitting Advisor through Multi-turn Multimodal LLM Conversation

LALM-Eval: An Open-Source Toolkit for Holistic Evaluation of Large Audio Language Models

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs
