Advancements in Parallel Computing and Sparse Tensor Decomposition

The field of parallel computing is seeing significant advances, driven by the demand for efficient processing of large-scale datasets. Researchers are pursuing several directions to accelerate computation: leveraging heterogeneous systems, developing domain-specific languages, and building scalable indexing methods. A key focus is the optimization of sparse tensor decomposition, a kernel that underlies many large-scale data analysis workloads. Noteworthy papers across these areas include:

- AMPED, which accelerates MTTKRP for billion-scale sparse tensor decomposition and achieves a 5.1x geometric-mean speedup in total execution time over state-of-the-art GPU baselines using 4 GPUs on a single CPU node.
- GALE, which achieves up to 2.7x speedup over state-of-the-art localized data structures for unstructured mesh data analysis while maintaining memory efficiency.
- AcceleratedKernels.jl, which enables productive cross-architecture parallel programming with minimal implementation and usage complexity, and matches the performance of C and OpenMP-multithreaded CPU implementations.
- Mapple, a domain-specific language for mapping distributed heterogeneous parallel programs, which reduces mapper code size by 14x and yields performance improvements of up to 1.34x over expert-written C++ mappers.
- PathWeaver, which achieves a 3.24x geometric-mean speedup, and up to 5.30x at a 95% recall rate, over state-of-the-art multi-GPU graph-based ANNS frameworks.
- SHINE, a scalable HNSW index for approximate nearest neighbor search in disaggregated memory, which reaches the same accuracy as a single-machine HNSW index.
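
For readers unfamiliar with the kernel AMPED targets, the sketch below shows a minimal serial mode-0 MTTKRP (matricized tensor times Khatri-Rao product) over a 3-way sparse tensor stored in COO format. It is a reference for the operation itself, not AMPED's multi-GPU implementation; the function name, data layout, and toy inputs are assumptions made for this example.

```python
import numpy as np

def mttkrp_mode0(coords, vals, dims, B, C):
    """Mode-0 MTTKRP for a 3-way sparse tensor in COO format (illustrative sketch).

    coords : (nnz, 3) integer array of (i, j, k) indices
    vals   : (nnz,) array of nonzero values
    dims   : (I, J, K) tensor dimensions
    B, C   : factor matrices of shape (J, R) and (K, R)

    Returns M of shape (I, R) with M[i, :] += X[i, j, k] * (B[j, :] * C[k, :]).
    """
    I, _, _ = dims
    R = B.shape[1]
    M = np.zeros((I, R))
    for (i, j, k), x in zip(coords, vals):
        # Hadamard product of the two factor rows, scaled by the nonzero
        # value and accumulated into the output row for index i.
        M[i, :] += x * (B[j, :] * C[k, :])
    return M

# Toy example: a 4x3x2 tensor with 3 nonzeros and rank-2 factors.
coords = np.array([[0, 1, 0], [2, 0, 1], [3, 2, 1]])
vals = np.array([1.0, 2.0, -0.5])
B = np.random.rand(3, 2)
C = np.random.rand(2, 2)
print(mttkrp_mode0(coords, vals, (4, 3, 2), B, C))
```

Multi-GPU systems such as AMPED partition and parallelize this accumulation across devices; the serial loop above is only the mathematical reference.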
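
PathWeaver and SHINE both build on graph-based approximate nearest neighbor search, whose core routine is a best-first traversal of a proximity graph (the search layer used by HNSW-style indexes). The following is a generic, single-threaded sketch of that traversal under assumed inputs (an adjacency dict, a NumPy array of vectors, and a beam size `ef`); it is not the API of either system.

```python
import heapq
import numpy as np

def greedy_graph_search(graph, points, query, entry, ef=8):
    """Best-first search over a proximity graph (HNSW-style search layer, sketch).

    graph  : dict mapping node id -> list of neighbor ids
    points : (n, d) array of database vectors
    query  : (d,) query vector
    entry  : id of the entry point
    ef     : candidate beam size (larger -> higher recall, slower)

    Returns (distance, id) pairs sorted by distance to the query.
    """
    dist = lambda v: float(np.linalg.norm(points[v] - query))
    visited = {entry}
    candidates = [(dist(entry), entry)]   # min-heap of nodes to expand
    best = [(-dist(entry), entry)]        # max-heap (negated) of current best ef
    while candidates:
        d, v = heapq.heappop(candidates)
        # Stop once no remaining candidate can improve a full beam.
        if len(best) >= ef and d > -best[0][0]:
            break
        for u in graph[v]:
            if u in visited:
                continue
            visited.add(u)
            du = dist(u)
            if len(best) < ef or du < -best[0][0]:
                heapq.heappush(candidates, (du, u))
                heapq.heappush(best, (-du, u))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-d, u) for d, u in best)

# Toy usage: 100 random 8-d points connected in a simple ring graph.
pts = np.random.rand(100, 8)
ring = {i: [(i - 1) % 100, (i + 1) % 100] for i in range(100)}
print(greedy_graph_search(ring, pts, np.random.rand(8), entry=0, ef=4)[:3])
```

The cited systems add the parts this sketch omits: PathWeaver distributes the traversal across multiple GPUs for throughput, and SHINE places the index in disaggregated memory while preserving single-machine accuracy.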

Sources

AMPED: Accelerating MTTKRP for Billion-Scale Sparse Tensor Decomposition on Multiple GPUs

GALE: Leveraging Heterogeneous Systems for Efficient Unstructured Mesh Data Analysis

AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase

Mapple: A Domain-Specific Language for Mapping Distributed Heterogeneous Parallel Programs

PathWeaver: A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search

Multiprocessor Scheduling with Memory Constraints: Fundamental Properties and Finding Optimal Solutions

SHINE: A Scalable HNSW Index in Disaggregated Memory

Building an Accelerated OpenFOAM Proof-of-Concept Application using Modern C++

Built with on top of