Computer architecture research is shifting toward chiplet-based systems and processing-in-memory (PIM) architectures, both aimed at the memory bandwidth wall that limits memory-intensive workloads. On the integration side, 2.5D/3D heterogeneous integration and chiplet-based memory modules are gaining traction as ways to build larger-scale VLSI systems that move data more energy-efficiently. On the PIM side, the goal is to reduce data movement by computing closer to where data is stored, and recent designs integrate more capable processing components, such as systolic arrays and SRAM-based buffers, into the PIM architecture. A further line of work makes better use of CPU pins to alleviate memory bandwidth constraints. Together, these advances could improve the performance and efficiency of workloads such as large language models and vision transformers. Noteworthy papers include Sangam, which presents a chiplet-based memory module that achieves significant speedup and energy savings for large language model inference, and DCC, a data-centric ML compiler for PIM systems that jointly optimizes data rearrangements and compute code, achieving up to 7.68x speedup on HBM-PIM and up to 13.17x speedup on the AttAcc PIM backend over GPU-only execution.