The field of deep neural network (DNN) acceleration and optimization is evolving rapidly, with a focus on improving performance, reducing power consumption, and increasing efficiency. Recent developments center on innovative dataflow optimization techniques, novel programming models, and hardware-aware mapping frameworks. These advances aim to address the challenges posed by emerging DNN models, such as large language models and state space models, which demand optimized data movement, memory hierarchies, and compute throughput.

Notable papers in this area include:

- Bit Transition Reduction by Data Transmission Ordering in NoC-based DNN Accelerator: proposes a 1-bit count-based ordering method that reduces bit transitions on network-on-chip links, achieving up to 32.01% link power reduction (a simplified sketch of the underlying idea follows this list).
- COMET: A Framework for Modeling Compound Operation Dataflows with Explicit Collectives: introduces a novel representation for modeling collective communication, achieving up to 3.46x speedup for certain operations.
- Dato: A Task-Based Programming Model for Dataflow Accelerators: elevates data communication and sharding to first-class type constructs, achieving up to 84% hardware utilization for certain tasks.
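To make the bit-transition idea concrete, the sketch below is a minimal illustration, not the paper's actual algorithm: it assumes a batch of data words queued for transmission over a single NoC link, reorders them by their 1-bit count (popcount), and compares the total number of bit flips between consecutive words before and after reordering. Fewer flips mean lower switching activity and thus lower link power. All function and variable names here are hypothetical.

```python
import random


def bit_transitions(words):
    """Count total bit flips between consecutive words sent on the link."""
    return sum(bin(prev ^ curr).count("1") for prev, curr in zip(words, words[1:]))


def order_by_popcount(words):
    """Reorder a transmission batch by ascending 1-bit count (popcount).

    Words with similar popcounts tend to differ in fewer bit positions,
    so sending them back-to-back reduces switching activity.
    """
    return sorted(words, key=lambda w: bin(w).count("1"))


if __name__ == "__main__":
    random.seed(0)
    # 256 random 16-bit words standing in for one transmission batch.
    batch = [random.randrange(1 << 16) for _ in range(256)]
    print("transitions, original order:", bit_transitions(batch))
    print("transitions, popcount order:", bit_transitions(order_by_popcount(batch)))
```

Running this typically shows a noticeable drop in transitions for the reordered batch; the actual paper's method and its 32.01% figure apply to a full NoC-based DNN accelerator setting, which this toy example does not model.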