The field of Large Language Model (LLM) agents is shifting toward a utility-driven perspective: evaluating agents by their overall return on investment (ROI) rather than optimizing model performance alone. This shift is driven by the need to make LLM agents more scalable, accessible, and effective in real-world settings. Researchers are exploring approaches to improve the usability of LLM agents, including anthropomorphized language agents for user-experience studies and frameworks for building production-grade conversational agents. Notable papers in this area include:
- A paper proposing a framework for evaluating agents through the lens of Agentic ROI, which weighs the quality of information an agent delivers against the time and cost it consumes (a hedged sketch of such a metric appears after this list).
- A paper presenting Agentic H-CI, a framework for crowdsourcing agents for scalable user studies, demonstrating that agent participants can yield budget-friendly yet insightful user-study findings at scale.
- A paper introducing OSS-UAgent, an automated and configurable agent-based usability evaluation framework for open source software, which employs LLM-powered agents to simulate developers performing programming tasks (the second sketch below illustrates this simulated-user pattern).
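To make the Agentic ROI idea concrete, here is a minimal Python sketch that scores an agent run by the information quality it delivers relative to the time and cost it consumes. The ratio form, field names, and weights are assumptions chosen for illustration; the cited paper's exact definition may differ.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One agent run on a task (all fields are illustrative)."""
    information_quality: float  # task-specific quality score in [0, 1]
    agent_time_s: float         # wall-clock time the agent took, in seconds
    cost_usd: float             # total spend (API calls, tools), in dollars


def agentic_roi(run: TaskRun, time_weight: float = 1.0, cost_weight: float = 1.0) -> float:
    """Toy ROI: quality delivered per unit of weighted time and cost.

    The simple ratio and the weights are assumptions for illustration,
    not the paper's formula.
    """
    denom = time_weight * run.agent_time_s + cost_weight * run.cost_usd
    return run.information_quality / denom if denom > 0 else 0.0


# Example: a fast, cheap agent can out-ROI a slower, higher-quality one.
fast_cheap = TaskRun(information_quality=0.72, agent_time_s=30.0, cost_usd=0.05)
slow_pricey = TaskRun(information_quality=0.85, agent_time_s=300.0, cost_usd=0.60)
print(agentic_roi(fast_cheap), agentic_roi(slow_pricey))
```

Even this toy version captures the section's point: an agent that is marginally better on quality can still lose on ROI once its time and cost enter the denominator.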
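The second sketch illustrates the agent-as-simulated-user pattern shared by Agentic H-CI and OSS-UAgent: prompting an LLM to role-play participants at different experience levels and report usability friction. The client, model name, personas, and prompts here are assumptions for illustration, not the actual APIs of either framework.

```python
# Hypothetical sketch of simulating study participants with an LLM.
# Assumes the OpenAI Python SDK (>=1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

PERSONAS = ["novice developer", "intermediate developer", "expert developer"]
TASK = "Write a script that lists all open issues using the project's CLI."


def simulate_developer(persona: str, task: str) -> str:
    """Ask the model to attempt the task in character and report friction points."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    f"You are a {persona} evaluating an open source tool. "
                    "Attempt the task, then list any usability problems you hit."
                ),
            },
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content


for persona in PERSONAS:
    print(f"--- {persona} ---")
    print(simulate_developer(persona, TASK))
```

Running one prompt per persona is what makes such studies budget-friendly at scale: the marginal cost of an extra simulated participant is a single model call rather than a recruited human.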