The field of image generation and editing is advancing rapidly, with a focus on improving both the quality of generated images and the degree of control over them. Researchers are exploring new ways to incorporate knowledge and semantics into image generation models, enabling them to capture complex dependencies and relationships between visual elements. Another notable trend is the development of frameworks that treat masked or discarded image regions as valuable sources of information rather than discarding them, using them to enhance feature learning and preserve fine-grained detail. There is also growing interest in fine-grained control over photographic elements in video editing, enabling more sophisticated and aesthetically pleasing visual effects.

Noteworthy papers in this area include:
- Improved Masked Image Generation with Knowledge-Augmented Token Representations, which introduces a framework for incorporating token-level semantic dependencies into image generation models.
- MaskAnyNet, which treats masked content as auxiliary knowledge rather than ignoring it, improving feature learning and semantic diversity.
- UniSER, which removes soft effects from images, such as lens flare and haze, with a single versatile model.
- BokehFlow, which renders controllable bokeh effects without requiring depth inputs, using a flow matching approach.
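Flow matching, mentioned above in connection with BokehFlow, trains a model to predict the velocity that transports noise samples toward data along a simple interpolation path. The sketch below is a generic toy illustration of that training objective on 2-D points, with a linear regressor standing in for a neural network; it is not drawn from BokehFlow's implementation, and all names in it are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_data(n):
    # Toy "data" distribution: points on the unit circle.
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.stack([np.cos(angle), np.sin(angle)], axis=1)

# Linear stand-in for the velocity network: maps (x, t) to a predicted velocity.
W = rng.normal(scale=0.1, size=(3, 2))

def fm_loss(W, n=2000):
    # Conditional flow matching objective: regress the velocity x1 - x0 at a
    # random point xt on the straight-line path between noise x0 and data x1.
    x1 = sample_data(n)
    x0 = rng.normal(size=x1.shape)
    t = rng.uniform(size=(n, 1))
    xt = (1.0 - t) * x0 + t * x1
    feats = np.concatenate([xt, t], axis=1)
    resid = feats @ W - (x1 - x0)
    return float(np.mean(np.sum(resid ** 2, axis=1)))

loss_before = fm_loss(W)
lr = 0.05
for _ in range(2000):
    x1 = sample_data(128)
    x0 = rng.normal(size=x1.shape)
    t = rng.uniform(size=(128, 1))
    xt = (1.0 - t) * x0 + t * x1
    feats = np.concatenate([xt, t], axis=1)
    grad = feats.T @ (feats @ W - (x1 - x0)) / 128.0  # MSE gradient wrt W
    W -= lr * grad
loss_after = fm_loss(W)
print(loss_before, loss_after)  # training reduces the regression loss
```

A real system replaces the linear map with a neural network and generates samples by integrating dx/dt = v(x, t) from noise to data, but the training objective is this conditional velocity regression.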