Advancements in Deep Neural Network Acceleration and Optimization

The field of deep neural network (DNN) acceleration and optimization is evolving rapidly, with a focus on improving performance, reducing power consumption, and increasing efficiency. Recent developments center on innovative dataflow optimization techniques, novel programming models, and hardware-aware mapping frameworks. These advances aim to address the challenges posed by emerging DNN models, such as large language models and state space models, which demand optimized data movement, memory hierarchy, and compute throughput. Notable papers in this area include:

Bit Transition Reduction by Data Transmission Ordering in NoC-based DNN Accelerator, which proposes a 1-bit-count-based transmission ordering method that reduces bit transitions and achieves up to 32.01% link power reduction.

COMET: A Framework for Modeling Compound Operation Dataflows with Explicit Collectives, which introduces a novel representation for modeling collective communication and achieves up to 3.46x speedup for certain operations.

Dato: A Task-Based Programming Model for Dataflow Accelerators, which elevates data communication and sharding to first-class type constructs and achieves up to 84% hardware utilization for certain tasks.
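The intuition behind 1-bit-count-based ordering can be sketched in a few lines: if words queued for transmission on a NoC link are sent in order of their population count (number of set bits), consecutive words tend to have a smaller Hamming distance, so fewer link wires toggle per transfer. The sketch below is illustrative only; the function names, and the assumption that the link idles at all-zeros, are mine, not from the paper.

```python
def bit_transitions(words):
    """Count total bit flips on a link when words are sent in the given order.

    Assumes (for illustration) that the link starts in the all-zeros state.
    """
    flips = 0
    prev = 0
    for w in words:
        flips += bin(prev ^ w).count("1")  # Hamming distance to previous word
        prev = w
    return flips

def order_by_popcount(words):
    """Sketch of a 1-bit-count-based ordering: transmit words sorted by their
    number of set bits, so adjacent words differ in fewer bit positions."""
    return sorted(words, key=lambda w: bin(w).count("1"))

# Small demonstration on 4-bit words.
words = [0b1111, 0b0000, 0b1110, 0b0001]
print(bit_transitions(words))                     # 15 transitions as queued
print(bit_transitions(order_by_popcount(words)))  # 6 transitions after ordering
```

Real accelerators would apply such a reordering per packet or per burst, subject to dependency and latency constraints that this toy example ignores.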

Sources

Bit Transition Reduction by Data Transmission Ordering in NoC-based DNN Accelerator

COMET: A Framework for Modeling Compound Operation Dataflows with Explicit Collectives

HiCR, an Abstract Model for Distributed Heterogeneous Programming

Hardware-Aware Data and Instruction Mapping for AI Tasks: Balancing Parallelism, I/O and Memory Tradeoffs

Dynamic reconfiguration for malleable applications using RMA

High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator

Dato: A Task-Based Programming Model for Dataflow Accelerators
