Advances in Speech and Posture Analysis using Large Language Models

The field of speech and posture analysis is moving towards the integration of large language models (LLMs) to enable fine-grained understanding and personalized feedback. This shift is driven by the need for accurate, clinically interpretable systems that can deliver feedback in real time. Recent developments have demonstrated the potential of LLMs in healthcare, particularly in speech therapy and posture monitoring. Notable papers in this area include:

  • SitLLM, which proposes a lightweight multimodal framework for sitting posture health understanding by coupling pressure sensor data with LLMs (a generic sensor-to-prompt sketch follows this list).
  • UTI-LLM, which presents a personalized articulatory-speech therapy assistance system based on multimodal LLMs.
  • Deploying UDM Series in Real-Life Stuttered Speech Applications, which evaluates the clinical effectiveness of the Unconstrained Dysfluency Modeling series in real-world speech therapy applications.
  • A long-form single-speaker real-time MRI speech dataset and benchmark, which releases the dataset for speech research and reports baseline performance on downstream tasks.
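
A common pattern behind multimodal frameworks of this kind is to project a non-text signal (here, a flattened pressure map) into the LLM's embedding space as a short sequence of "soft prompt" tokens that are prepended to the text prompt. The sketch below illustrates that general pattern in PyTorch; it is not the SitLLM implementation, and the module name, grid size, token count, and dimensions are illustrative assumptions.

    # Illustrative sketch (not SitLLM's code): project a pressure-sensor map into
    # soft-prompt embeddings that are prepended to text embeddings for an LLM.
    import torch
    import torch.nn as nn

    class PressurePromptProjector(nn.Module):
        """Maps a flattened pressure map to a short sequence of prompt embeddings."""

        def __init__(self, grid_cells: int = 32 * 32, n_prompt_tokens: int = 8, d_model: int = 768):
            super().__init__()
            self.n_prompt_tokens = n_prompt_tokens
            self.d_model = d_model
            self.proj = nn.Sequential(
                nn.Linear(grid_cells, 512),
                nn.GELU(),
                nn.Linear(512, n_prompt_tokens * d_model),
            )

        def forward(self, pressure: torch.Tensor) -> torch.Tensor:
            # pressure: (batch, grid_cells) -> (batch, n_prompt_tokens, d_model)
            batch = pressure.shape[0]
            return self.proj(pressure).view(batch, self.n_prompt_tokens, self.d_model)

    if __name__ == "__main__":
        batch, grid_cells, d_model = 2, 32 * 32, 768
        projector = PressurePromptProjector(grid_cells=grid_cells, d_model=d_model)

        pressure_map = torch.rand(batch, grid_cells)   # simulated sensor frame
        text_embeds = torch.rand(batch, 16, d_model)   # stand-in for embedded text prompt

        # Prepend sensor-derived prompt tokens to the text embeddings; the combined
        # sequence would then be fed to a frozen or fine-tuned LLM decoder.
        sensor_prompts = projector(pressure_map)
        llm_inputs = torch.cat([sensor_prompts, text_embeds], dim=1)
        print(llm_inputs.shape)  # torch.Size([2, 24, 768])

In practice the projector is trained while the LLM backbone stays frozen or is lightly fine-tuned, which keeps the framework lightweight; the same prefix-projection idea applies to other modalities, such as articulatory imaging in speech therapy systems.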

Sources

SitLLM: Large Language Models for Sitting Posture Health Understanding via Pressure Sensor Data

UTI-LLM: A Personalized Articulatory-Speech Therapy Assistance System Based on Multimodal Large Language Model

Deploying UDM Series in Real-Life Stuttered Speech Applications: A Clinical Evaluation Framework

A long-form single-speaker real-time MRI speech dataset and benchmark