The field of agentic systems and large language models is evolving rapidly, with growing attention to the robustness, personalization, and security of these systems. Researchers are exploring new approaches to benchmarking and evaluation, including novel taxonomies and benchmarks, and placing increasing emphasis on safety and reliability, particularly the detection and prevention of harmful behaviors. Large language models are also being applied across a range of domains, including science and high-performance computing, virtual reality, and mixed reality. Notable papers in this area include:

- Benchmarking the Robustness of Agentic Systems to Adversarially-Induced Harms, which proposes a novel benchmark for evaluating the security of agentic systems.
- PerPilot: Personalizing VLM-based Mobile Agents via Memory and Exploration, which introduces a framework for personalizing mobile agents using large language models.
- Reliable Weak-to-Strong Monitoring of LLM Agents, which presents a systematized monitor red-teaming workflow for detecting covert misbehavior in autonomous LLM agents.
- Aegis: Taxonomy and Optimizations for Overcoming Agent-Environment Failures in LLM Agents, which proposes a taxonomy of agent-environment interaction failures and designs targeted environment optimizations to improve agent success rates.