Graph-Based Approaches in Image Classification and GUI Grounding

Computer vision research is increasingly turning to graph-based approaches, particularly for image classification and GUI grounding. Researchers are using Graph Convolutional Networks (GCNs) and Voronoi diagrams to model complex, relational data, and new frameworks combine the two to improve image classification accuracy and efficiency. There is also growing interest in concept bottleneck models (CBMs), which give deep neural networks explicit interpretations; recent work adds graph structures and locality-awareness to improve both performance and interpretability. Noteworthy papers in this area include:

  • V2P, which proposes a Valley-to-Peak method for robust GUI grounding and reports high performance on benchmarks.
  • Graph Concept Bottleneck Models, which introduces a CBM variant that models relationships between concepts through latent concept graphs and reports superior performance and interpretability.
  • Locality-aware Concept Bottleneck Model, which uses prototype learning to localize concepts accurately in the image, demonstrating improved localization and comparable classification performance.
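
For readers new to the graph convolution that the GCN-based work above builds on, here is a minimal sketch of a single propagation step. It uses the widely known symmetric-normalized rule, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), and is not the architecture of any paper listed here. The toy graph, the feature values, the weights, and all function names are assumptions chosen for illustration. In an image setting the nodes might stand for regions such as Voronoi cells, but that mapping is also an assumption.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees, counted after the self-loops were added.
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: aggregate neighbors, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Hypothetical toy graph: three nodes on a path 0 - 1 - 2.
ADJ = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# Hypothetical 2-dimensional node features (one row per node).
FEATURES = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
# Hypothetical weights projecting 2 input features to 1 output feature.
WEIGHTS = [
    [1.0],
    [-1.0],
]

OUT = gcn_layer(ADJ, FEATURES, WEIGHTS)
print(OUT)
```

Each output row mixes a node's own features with those of its neighbors, which is how a GCN captures relational structure. A real model would learn the weights and stack several such layers.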

Sources

V2P: From Background Suppression to Center Peaking for Robust GUI Grounding Task

Accelerating Image Classification with Graph Convolutional Neural Networks using Voronoi Diagrams

Graph Concept Bottleneck Models

Locality-aware Concept Bottleneck Model

Fast Graph Neural Network for Image Classification
