Advancements in Agentic AI and Large Language Models

The field of artificial intelligence is evolving rapidly, with growing attention to agentic AI systems and large language models (LLMs). These systems are being applied in a variety of high-stakes domains, including social policymaking, construction project management, and cybersecurity. Recent research highlights the potential of LLMs to provide valuable insights and support human decision-making, but it also raises important questions about their reliability, accountability, and the risk of human overreliance.

Notable papers in this area include Adaptive Monitoring and Real-World Evaluation of Agentic AI Systems, which presents a novel algorithm for adaptive multi-dimensional monitoring and demonstrates its effectiveness in detecting anomalies and reducing false-positive rates, and What Would an LLM Do? Evaluating Policymaking Capabilities of Large Language Models, which evaluates the policymaking capabilities of LLMs and finds promising potential for using them in social policymaking.

Sources

Adaptive Monitoring and Real-World Evaluation of Agentic AI Systems

What Would an LLM Do? Evaluating Policymaking Capabilities of Large Language Models

The Ethical Compass of the Machine: Evaluating Large Language Models for Decision Support in Construction Project Management

LLMs in Cybersecurity: Friend or Foe in the Human Decision Loop?

The Law-Following AI Framework: Legal Foundations and Technical Constraints. Legal Analogues for AI Actorship and technical feasibility of Law Alignment

Measuring and mitigating overreliance is necessary for building human-compatible AI

HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants
