Advancements in Hardware-Aware Neural Networks and Edge AI

Edge AI research is moving toward greater efficiency and scalability through hardware-aware neural networks and novel number formats. Researchers are designing models and architectures that optimize performance, power consumption, and silicon area. Notable advancements include a variable-point number format for area- and energy-efficient multiplication of high-dynamic-range numbers and a hardware-aware neural architecture search framework for early-exiting networks on edge accelerators. NPU-native vision-language models and near-memory architectures for event-based computer vision are also extending the range of edge AI applications. Together, these innovations could enable real-time, low-latency processing on resource-constrained devices across a variety of edge environments. Particularly noteworthy papers are those introducing the HDAP framework for hardware-aware DNN compression and the AutoNeural architecture for NPU-native vision-language models.

Sources

VeriPy - A New Python-Based Approach for SDR Pipelined/Unrolled Hardware Accelerator Generation

Hardware-Aware DNN Compression for Homogeneous Edge Devices

Hardware-Aware Neural Network Compilation with Learned Optimization: A RISC-V Accelerator Approach

A Configurable Mixed-Precision Fused Dot Product Unit for GPGPU Tensor Computation

From RISC-V Cores to Neuromorphic Arrays: A Tutorial on Building Scalable Digital Neuromorphic Processors

Ternary-Input Binary-Weight CNN Accelerator Design for Miniature Object Classification System with Query-Driven Spatial DVS

Variable Point: A Number Format for Area- and Energy-Efficient Multiplication of High-Dynamic-Range Numbers

hls4ml: A Flexible, Open-Source Platform for Deep Learning Acceleration on Reconfigurable Hardware

Intrusion Detection on Resource-Constrained IoT Devices with Hardware-Aware ML and DL

Adversarial Robustness of Traffic Classification under Resource Constraints: Input Structure Matters

Model Recovery at the Edge under Resource Constraints for Physical AI

Near-Memory Architecture for Threshold-Ordinal Surface-Based Corner Detection of Event Cameras

AutoNeural: Co-Designing Vision-Language Models for NPU Inference

The BrainScaleS-2 Multi-Chip System: Interconnecting Continuous-Time Neuromorphic Compute Substrates

Hardware-Aware Neural Architecture Search of Early Exiting Networks on Edge Accelerators