Advancements in Large Language Models for Social Media, Education, and Healthcare

Research on large language models (LLMs) is advancing quickly in social media, education, and healthcare. Recent studies have focused on LLMs that simulate social media dynamics, generate personalized educational content, and improve patient-physician communication. New datasets such as BluePrint and PHORECAST support training and evaluating LLMs as social media agents and as predictors of how people respond to public health messaging. In education, the TutorBench benchmark evaluates the tutoring capabilities of LLMs, EduPersona evaluates the subjective abilities of virtual student agents, and related work generates synthetic student-tutor dialogues. In healthcare, researchers have used LLMs to infer patient-perceived physician traits from online reviews. Together, these efforts point to the potential of LLMs to support applications across all three fields.

Noteworthy papers include BluePrint, which introduces a large-scale dataset for training and evaluating LLMs as social media agents, and PHORECAST, which presents a multimodal dataset for predicting individual-level behavioral responses and community-wide engagement patterns to health messaging. TutorBench and EduPersona provide benchmarks for assessing the tutoring capabilities and subjective abilities of LLMs, respectively. TeachLM post-trains LLMs for education on authentic learning data, and Human Behavior Atlas benchmarks unified understanding of psychological and social behavior.

Sources

BluePrint: A Social Media User Dataset for LLM Persona Evaluation and Training

PHORECAST: Enabling AI Understanding of Public Health Outreach Across Populations

TutorBench: A Benchmark To Assess Tutoring Capabilities Of Large Language Models

When Patients Go to "Dr. Google" Before They Go to the Emergency Department

Lightweight Prompt Engineering for Cognitive Alignment in Educational AI: A OneClickQuiz Case Study

APIDA-Chat: Structured Synthesis of API Search Dialogues to Bootstrap Conversational Agents

Mapping Patient-Perceived Physician Traits from Nationwide Online Reviews with LLMs

Simulating and Understanding Deceptive Behaviors in Long-Horizon Interactions

EduPersona: Benchmarking Subjective Ability Boundaries of Virtual Student Agents

Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents

TeachLM: Post-Training LLMs for Education Using Authentic Learning Data

Human Behavior Atlas: Benchmarking Unified Psychological and Social Behavior Understanding

Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails

Instructional Goal-Aligned Question Generation for Student Evaluation in Virtual Lab Settings: How Closely Do LLMs Actually Align?

Aligning Large Language Models via Fully Self-Synthetic Data

PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch

FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline

VelLMes: A high-interaction AI-based deception framework

The Limits of Goal-Setting Theory in LLM-Driven Assessment