Computer vision and graphics research is making notable progress in how images and models are represented. Researchers are exploring new ways to decompose complex visual data, such as images and 3D models, into representations that are more efficient and more meaningful. These approaches include diffusion models, transformer-based architectures, and line-based representations, all aimed at improving the accuracy and flexibility of image and model analysis. Integrating semantic, spatial, and topological information is becoming increasingly important for capturing the complex relationships in visual data.
Several papers in this area are noteworthy. DiffDecompose introduces a framework for layer-wise decomposition of alpha-composited images via diffusion transformers. Unified Network-Based Representation of BIM Models proposes a method for capturing the complex spatial and topological relationships between components in building information models. Point or Line? uses a line-based representation for panoptic symbol spotting in CAD drawings and reports state-of-the-art results. LayerPeeler presents an autoregressive peeling approach to layer-wise image vectorization, producing high-quality vector graphics with complete paths and coherent layer structures.
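
For readers unfamiliar with alpha compositing, the sketch below shows the standard "over" operator, which is the forward process that layer-wise decomposition tries to invert. It is only an illustration of that forward model and not the method of DiffDecompose or any other paper listed here. The function names `composite_over` and `flatten_layers` are hypothetical, and the sketch assumes non-premultiplied RGB values in [0, 1] and a fully opaque background.

```python
def composite_over(fg_rgb, fg_alpha, bg_rgb):
    """Blend one foreground pixel over an opaque background pixel.

    Standard "over" operator for non-premultiplied colour:
    out = alpha * fg + (1 - alpha) * bg, applied per channel.
    """
    return tuple(fg_alpha * f + (1.0 - fg_alpha) * b
                 for f, b in zip(fg_rgb, bg_rgb))


def flatten_layers(background, layers):
    """Composite (rgb, alpha) layers, ordered bottom to top, onto a background pixel."""
    out = background
    for rgb, alpha in layers:
        out = composite_over(rgb, alpha, out)
    return out


# Half-transparent red over opaque blue.
pixel = flatten_layers((0.0, 0.0, 1.0), [((1.0, 0.0, 0.0), 0.5)])
print(pixel)  # (0.5, 0.0, 0.5)
```

The same output pixel could also come from a fully opaque layer of colour (0.5, 0.0, 0.5), so recovering the original layers from a single composite is ambiguous without further priors.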