Advances in Large Language Model-Based Decision-Making and Agentic AI

Research on Large Language Model (LLM)-based decision-making and agentic AI systems is advancing significantly. Researchers are exploring reinforcement learning, cognitive architectures, and hybrid strategies to improve the performance and reliability of these systems. One key direction integrates cognitive mechanisms and internal state awareness to improve the consistency and contextual alignment of LLM-based role-playing agents. Another develops frameworks and methods for measuring and evaluating the identity and trustworthiness of language model agents. Noteworthy papers in this area include:

  • QSAF, which introduces a novel mitigation framework for cognitive degradation in agentic AI systems,
  • HAMLET, which proposes a hyperadaptive agent-based modeling framework for live embodied theatrics,
  • Test-Time-Matching, which enables training-free role-playing through test-time scaling and context engineering,
  • CogDual, which enhances dual cognition of LLMs via reinforcement learning with implicit rule-based rewards,
  • Agent Identity Evals, which provides a rigorous framework for measuring agentic identity, and
  • Shop-R1, which uses reinforcement learning to reward LLMs for simulating human behavior in online shopping.

Sources

Feedback-Induced Performance Decline in LLM-Based Decision-Making

QSAF: A Novel Mitigation Framework for Cognitive Degradation in Agentic AI

HAMLET: Hyperadaptive Agent-based Modeling for Live Embodied Theatrics

Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent

CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards

Agent Identity Evals: Measuring Agentic Identity

Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning
