The field of artificial intelligence is moving toward more autonomous, self-training systems. Recent research has focused on multi-agent platforms that combine user agents, cognitive agents, and experiment managers to integrate problem specification, experiment planning, and execution into end-to-end systems (a pipeline sketched below); these systems have shown strong performance and efficiency across benchmarks spanning regression, NLP, computer vision, and drug discovery. There is also growing interest in evaluating the trustworthiness and reliability of agentic AI systems, with an emphasis on transparent and verifiable evaluation frameworks. Other notable advances include tool-augmented planning for ML tasks, process-centric analysis of agentic software systems, and cost-reduction methods for LLM agent inference.

Noteworthy papers in this area include:

- SelfAI, which proposes a general multi-agent platform for autonomous scientific discovery.
- ML-Tool-Bench, which introduces a comprehensive benchmark for evaluating tool-augmented ML agents.
- DrawingBench, which presents a verification framework for evaluating the trustworthiness of agentic LLMs through spatial-reasoning tasks.
- In-Context Distillation with Self-Consistency Cascades, which proposes a simple method for reducing LLM agent inference costs without additional training (see the cascade sketch below).
- EnCompass, which introduces a new approach to agent programming that disentangles core workflow logic from inference-time strategy (see the final sketch below).
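To make the platform structure concrete, here is a minimal, hypothetical sketch of the user-agent / cognitive-agent / experiment-manager pipeline described above. All class and method names are illustrative assumptions for this summary, not SelfAI's actual API.

```python
from dataclasses import dataclass

# Hypothetical sketch of the three-agent pipeline described above.
# Class and method names are illustrative, not SelfAI's actual API.

@dataclass
class ProblemSpec:
    task: str
    metric: str

class UserAgent:
    """Turns a natural-language request into a problem specification."""
    def specify(self, request: str) -> ProblemSpec:
        # In a real platform this step would itself be LLM-driven.
        return ProblemSpec(task=request, metric="accuracy")

class CognitiveAgent:
    """Plans the experiments needed to address a specification."""
    def plan(self, spec: ProblemSpec) -> list[str]:
        return [f"baseline for {spec.task}", f"tuned model for {spec.task}"]

class ExperimentManager:
    """Executes planned experiments and reports metric values."""
    def run(self, experiment: str) -> float:
        return 0.0  # stub: would train/evaluate and return the metric

def autonomous_run(request: str) -> dict[str, float]:
    # End-to-end loop: specification -> planning -> execution.
    spec = UserAgent().specify(request)
    plan = CognitiveAgent().plan(spec)
    manager = ExperimentManager()
    return {experiment: manager.run(experiment) for experiment in plan}
```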
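The self-consistency-cascade idea for cutting inference cost can be illustrated in a few lines: sample a cheap model several times and escalate to a stronger, more expensive model only when the cheap samples disagree. This is a generic sketch under that reading of the paper's title; `cheap_model`, `strong_model`, `k`, and the agreement threshold are assumptions, not the authors' interface.

```python
import collections
from typing import Callable

def cascade_answer(
    prompt: str,
    cheap_model: Callable[[str], str],   # sampled with temperature > 0
    strong_model: Callable[[str], str],  # expensive fallback
    k: int = 5,
    agreement: float = 0.6,
) -> str:
    """Answer with the cheap model when its samples self-agree,
    escalating to the strong model only on disagreement."""
    samples = [cheap_model(prompt) for _ in range(k)]
    answer, votes = collections.Counter(samples).most_common(1)[0]
    if votes / k >= agreement:
        return answer  # cheap samples are consistent: accept
    return strong_model(prompt)  # inconsistent: pay for the strong model
```

Because the strong model is only invoked on disagreement, average cost tracks the cheap model's on inputs the cheap model handles reliably.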
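Finally, the separation EnCompass argues for can be sketched generically: the workflow is written once against an abstract generation hook, and the inference-time strategy (greedy decoding, best-of-n sampling, and so on) is supplied as a plug-in. The names below are hypothetical illustrations, not the paper's actual programming model.

```python
from typing import Callable

# A strategy is any function that, given a model and a prompt, returns
# one answer; workflows are written against this hook alone.
Model = Callable[[str], str]
Strategy = Callable[[Model, str], str]

def greedy(model: Model, prompt: str) -> str:
    # Simplest strategy: a single model call.
    return model(prompt)

def best_of_n(n: int, score: Callable[[str], float]) -> Strategy:
    # Sample n candidates and keep the highest-scoring one.
    def strategy(model: Model, prompt: str) -> str:
        return max((model(prompt) for _ in range(n)), key=score)
    return strategy

def summarize_then_answer(model: Model, question: str, strategy: Strategy) -> str:
    # Core workflow logic: two dependent generation steps, written once.
    # Swapping `greedy` for `best_of_n(...)` changes the inference-time
    # search behavior without touching this function.
    facts = strategy(model, f"Summarize the key facts for: {question}")
    return strategy(model, f"Using these facts:\n{facts}\nAnswer: {question}")
```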