Research in video and image processing is moving toward methods that handle large volumes of data more efficiently. Recent work concentrates on two goals: reducing redundancy in vision datasets and compressing visual tokens. Approaches in this direction include dynamic-aware video distillation, multi-stage event-based token compression, and dynamic vision encoding, which are reported to improve both performance and efficiency and so enable faster, more accurate processing of video and image data. Notably, Dynamic-Aware Video Distillation proposes a novel approach to optimizing temporal resolution, while METok proposes one for compressing visual tokens. Images are Worth Variable Length of Representations introduces a dynamic vision encoder and DynTok a token compression strategy; both are reported to achieve state-of-the-art results on various benchmarks.
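To make the idea of removing redundancy among visual tokens concrete, the following is a minimal sketch of one generic strategy: merging consecutive tokens whose embeddings are nearly identical. It is an illustration only and is not the method of METok, DynTok, or any other paper named above. The function name `merge_redundant_tokens`, the 0.95 cosine-similarity threshold, and the toy 2-dimensional vectors are all assumptions chosen for the example.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_redundant_tokens(tokens: List[Vector], threshold: float = 0.95) -> List[Vector]:
    """Hypothetical helper: greedily fold each token into the previous kept
    token when their cosine similarity is at least `threshold`.

    A merged token is the running mean of all tokens folded into it.
    """
    merged: List[Vector] = []
    counts: List[int] = []  # number of original tokens behind each kept token
    for tok in tokens:
        if merged and cosine(merged[-1], tok) >= threshold:
            n = counts[-1]
            merged[-1] = [(m * n + t) / (n + 1) for m, t in zip(merged[-1], tok)]
            counts[-1] = n + 1
        else:
            merged.append(list(tok))
            counts.append(1)
    return merged


# Toy input (assumed data): two near-duplicate pairs of 2-D "tokens".
frames = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.0, 0.98]]
compressed = merge_redundant_tokens(frames)
print(len(frames), "->", len(compressed))  # prints "4 -> 2"
```

The sketch only compares each token with the most recently kept one, which suits sequences where redundancy is mostly between neighbours, such as adjacent video frames. The approaches surveyed above may decide what to keep or merge in quite different ways.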