The field of text-to-image generation and image editing is advancing rapidly, with a focus on improving the accuracy and compositional ability of models. Researchers are developing new benchmarks and evaluation frameworks to assess these models: CompAlign emphasizes 3D-spatial relationships, while CROC probes the robustness of evaluation metrics. Grounded evaluation frameworks such as GIE-Bench are another significant development, enabling more precise assessment of text-guided image editing models. Studies of generative AI on everyday image editing tasks further highlight the need for improvement in areas such as preserving the identity of people and animals. Noteworthy papers include CompAlign, which pairs a challenging compositional benchmark with fine-grained feedback for improving compositional image generation; CROC, which introduces a scalable framework for automated Contrastive Robustness Checks; GIE-Bench, which provides a diagnostic benchmark for evaluating text-guided image editing models; and KRIS-Bench, which introduces a knowledge-based reasoning benchmark for intelligent image editing systems.