Transformer research is increasingly focused on making tokenization and attention mechanisms both more efficient and more effective. Recent work aims to cut computational overhead without sacrificing accuracy, exploring techniques such as token reduction, token merging, and token freezing. In parallel, there is growing interest in understanding the principles underlying self-attention and in applying it across domains. Notable papers include new tokenization methods, such as the Latent Denoising Tokenizer, and new attention mechanisms, such as DistrAttention. Together, these advances stand to influence both natural language processing and computer vision.
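To make the token-merging idea concrete, the sketch below shows a generic bipartite merging step in PyTorch: tokens are split into two alternating sets, each token in the first set finds its most similar partner in the second, and the most redundant tokens are averaged into their partners so that later attention layers see a shorter sequence. This is a minimal illustration of the general technique, not the method of any paper cited above; the function name `bipartite_merge`, the merge budget `r`, and the plain 0.5/0.5 average are illustrative assumptions.

import torch
import torch.nn.functional as F

def bipartite_merge(x: torch.Tensor, r: int) -> torch.Tensor:
    """Merge r tokens per sequence.

    x: (batch, num_tokens, dim) token embeddings.
    Returns: (batch, num_tokens - r, dim).
    """
    b, n, d = x.shape
    # Split tokens into two alternating sets: sources (candidates to merge away) and destinations.
    src, dst = x[:, ::2], x[:, 1::2]
    # Cosine similarity of every source token to every destination token.
    scores = F.normalize(src, dim=-1) @ F.normalize(dst, dim=-1).transpose(1, 2)
    best_score, best_dst = scores.max(dim=-1)  # each source's closest destination

    # The r sources with the strongest match are the most redundant; merge them away.
    merge_idx = best_score.topk(r, dim=-1).indices            # (batch, r)
    keep_mask = torch.ones_like(best_score, dtype=torch.bool)
    keep_mask[torch.arange(b).unsqueeze(1), merge_idx] = False

    dst = dst.clone()
    for bi in range(b):  # kept as an explicit loop for clarity; real code would vectorize
        for s in merge_idx[bi]:
            t = best_dst[bi, s]
            # Plain average of the merged pair (an assumption; weighted means are also common).
            dst[bi, t] = 0.5 * (dst[bi, t] + src[bi, s])

    kept_src = src[keep_mask].view(b, -1, d)
    return torch.cat([kept_src, dst], dim=1)

# Example: shrink a 196-token ViT-style sequence by 48 tokens before the next block.
tokens = torch.randn(2, 196, 768)
print(bipartite_merge(tokens, r=48).shape)  # torch.Size([2, 148, 768])

Because attention cost grows quadratically with sequence length, even a modest reduction like this compounds across layers, which is the appeal of token reduction, merging, and freezing as efficiency strategies.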