The field of artificial intelligence is evolving rapidly, with a growing focus on agentic AI systems and large language models (LLMs). These systems are being applied in a variety of high-stakes domains, including social policymaking, construction project management, and cybersecurity. Recent research highlights the potential of LLMs to provide valuable insights and support human decision-making, but it also raises important questions about their reliability, their accountability, and the risk that users will over-rely on them.
Two papers in this area are notable. "Adaptive Monitoring and Real-World Evaluation of Agentic AI Systems" presents a novel algorithm for adaptive multi-dimensional monitoring and demonstrates its effectiveness in detecting anomalies and reducing false-positive rates. "What Would an LLM Do" evaluates the policymaking capabilities of LLMs and finds them promising as a tool for supporting social policymaking.