Advancements in AI and HPC Systems

The fields of Artificial Intelligence (AI) and High-Performance Computing (HPC) are evolving rapidly, with a focus on improving efficiency, performance, and scalability. Recent work centers on optimizing memory access, reducing data movement, and increasing computational throughput. Notably, innovations in processing-in-memory (PIM) architectures, heterogeneous systems, and advanced networking protocols are reshaping the landscape. These advances address the growing demands of AI workloads, such as large language models and graph analytics, which require substantial compute and memory bandwidth. Research in reconfigurable architectures, 3D-stacked systems, and thermally-aware scheduling is pushing these boundaries further. Overall, the field is moving toward more efficient, scalable, and sustainable systems that can support the increasing complexity of AI and HPC applications.

Noteworthy papers include VectorCDC, which accelerates content-defined data chunking in deduplication systems using vector instructions; TLV-HGNN, which proposes a reconfigurable hardware accelerator for efficient heterogeneous graph neural network (HGNN) inference; and THERMOS, which introduces a thermally-aware multi-objective scheduling framework for AI workloads on heterogeneous multi-chiplet PIM architectures.
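To make the chunking idea concrete: content-defined chunking (CDC) slides a rolling hash over the byte stream and cuts a chunk boundary whenever the hash matches a bit pattern, so boundaries follow content rather than fixed offsets and survive insertions. The sketch below is a minimal scalar CDC loop with a Gear-style rolling hash; the table, mask, and size bounds are illustrative assumptions, and it does not reproduce VectorCDC's actual vectorized implementation.

```python
import random

# Gear-style rolling-hash table. Real implementations ship a fixed,
# well-distributed 256-entry table; a seeded PRNG stands in here.
random.seed(0)
GEAR = [random.getrandbits(64) for _ in range(256)]

MASK = (1 << 13) - 1                 # ~8 KiB average chunk size (assumed)
MIN_CHUNK, MAX_CHUNK = 2048, 65536   # illustrative size bounds

def chunks(data: bytes):
    """Yield (offset, length) pairs for content-defined chunk boundaries."""
    start, h = 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFFFFFFFFFF  # roll the hash
        length = i - start + 1
        # Cut when the hash matches the mask pattern (and the chunk is
        # large enough), or when the maximum chunk size is reached.
        if (length >= MIN_CHUNK and h & MASK == 0) or length >= MAX_CHUNK:
            yield start, length
            start, h = i + 1, 0
    if start < len(data):            # trailing partial chunk
        yield start, len(data) - start

blob = random.randbytes(1 << 20)     # 1 MiB of test data
print(sum(1 for _ in chunks(blob)), "chunks")
```

The byte-at-a-time loop above is exactly the serial bottleneck that vectorization targets: each hash update depends on the previous one, so a vectorized approach must restructure the scan, for example by examining multiple candidate positions per instruction.

Similarly, the trade-off behind thermally-aware multi-objective scheduling can be sketched as a greedy scheduler that scores each placement on both latency and predicted temperature. This toy model (the cost weights, thermal update, and Chiplet fields are all assumptions) is not THERMOS's algorithm, only an illustration of the objective it balances.

```python
from dataclasses import dataclass

@dataclass
class Chiplet:
    name: str
    temp_c: float   # current temperature estimate (deg C)
    speed: float    # relative throughput; higher is faster

def assign(work: float, chiplets: list[Chiplet], alpha: float = 0.5) -> Chiplet:
    """Pick the chiplet minimizing a weighted latency/thermal cost."""
    def cost(c: Chiplet) -> float:
        latency = work / c.speed
        heat = 0.01 * work / c.speed     # toy thermal model (assumed)
        return (1 - alpha) * latency + alpha * (c.temp_c + heat)
    best = min(chiplets, key=cost)
    best.temp_c += 0.01 * work / best.speed  # update thermal state
    return best

pims = [Chiplet("pim0", 55.0, 1.0), Chiplet("pim1", 70.0, 1.5),
        Chiplet("gpu0", 60.0, 2.0)]
for work in (100.0, 200.0, 150.0, 300.0):
    print(assign(work, pims).name)
```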
Sources
Maximizing GPU Efficiency via Optimal Adapter Caching: An Analytical Approach for Multi-Tenant LLM Serving
XDMA: A Distributed, Extensible DMA Architecture for Layout-Flexible Data Movements in Heterogeneous Multi-Accelerator SoCs
Architecting Long-Context LLM Acceleration with Packing-Prefetch Scheduler and Ultra-Large Capacity On-Chip Memories