The field of speech and posture analysis is moving toward the integration of large language models (LLMs) to enable fine-grained understanding and personalized feedback. This shift is driven by demand for systems that are accurate, clinically interpretable, and able to deliver accessible feedback in real time. Recent work has demonstrated the potential of LLMs in healthcare, particularly in speech therapy and posture monitoring. Notable papers in this area include:
- SitLLM, which proposes a lightweight multimodal framework for sitting posture health understanding via pressure sensor data and LLMs.
- UTI-LLM, which presents a personalized articulatory-speech therapy assistance system based on multimodal LLMs.
- Deploying UDM Series in Real-Life Stuttered Speech Applications, which evaluates the clinical effectiveness of the Unconstrained Dysfluency Modeling series in real-world speech therapy applications.
- A long-form single-speaker real-time MRI speech dataset and benchmark, which releases a unique dataset for speech research and reports baseline results on downstream tasks.
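To make the sensor-plus-LLM pattern behind systems like SitLLM concrete, the sketch below shows one plausible pipeline stage: reducing a seat pressure map to coarse textual features and embedding them in an LLM prompt. This is a minimal illustration under stated assumptions; the function names, thresholds, and prompt wording are all hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a SitLLM-style pipeline stage: summarizing a 2D
# seat pressure grid as text so an LLM can reason about sitting posture.
# All names and thresholds are illustrative assumptions.

def summarize_pressure_map(grid):
    """Reduce a 2D pressure grid (rows of floats in 0..1) to a short
    textual description of overall load and left/right balance."""
    rows, cols = len(grid), len(grid[0])
    total = sum(sum(row) for row in grid)
    if total == 0:
        return "no contact detected"
    # Fraction of total pressure on the left half of the seat.
    left = sum(v for row in grid for c, v in enumerate(row) if c < cols // 2)
    balance = left / total
    if balance > 0.6:
        lean = "leaning left"
    elif balance < 0.4:
        lean = "leaning right"
    else:
        lean = "centered"
    return f"mean pressure {total / (rows * cols):.2f}, posture {lean}"

def build_prompt(grid):
    """Wrap the sensor summary in an instruction for the LLM."""
    return ("You are a sitting-posture coach. Sensor reading: "
            + summarize_pressure_map(grid)
            + ". Give one short piece of posture feedback.")

grid = [[0.1, 0.2, 0.8, 0.9],
        [0.1, 0.3, 0.7, 0.8]]
print(build_prompt(grid))
```

A real system would replace the hand-written summarizer with learned sensor embeddings aligned to the LLM's input space, but the text-bridge version above conveys the basic idea of turning pressure readings into something an LLM can act on.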