Advancements in Computer Vision

Recent work in computer vision focuses on improving the accuracy and efficiency of tasks such as edge detection, object detection, and image segmentation. Researchers are exploring ways to reduce computational cost and model size while maintaining accuracy, which makes these methods more viable for deployment on resource-constrained devices. New frameworks and architectures aim to improve semantic awareness, feature discriminability, and adaptability to different resource constraints. There is also growing interest in methods for high-resolution image analysis, anomaly detection, and video action analysis, which are important for applications such as industrial inspection, surveillance, and autonomous driving.

Some noteworthy papers in this area:

PEdger++ proposes a collaborative learning framework for efficient edge detection and reports clear improvements over existing methods.

Refine-and-Contrast presents a framework for adaptive instance-aware BEV representations and reports better accuracy-computation trade-offs.

HiAD introduces a general framework for high-resolution anomaly detection and reports superior performance on various benchmarks.

OmViD explores omni-supervised active learning for video action detection, significantly reducing annotation costs with minimal performance loss.

Multiscale Video Transformers develops efficient video transformers for class-agnostic segmentation in autonomous driving, outperforming multiscale baselines while remaining efficient in GPU memory and run-time.

Sources

PEdger++: Practical Edge Detection via Assembling Cross Information

Refine-and-Contrast: Adaptive Instance-Aware BEV Representations for Multi-UAV Collaborative Object Detection

SIS-Challenge: Event-based Spatio-temporal Instance Segmentation Challenge at the CVPR 2025 Event-based Vision Workshop

Towards High-Resolution Industrial Image Anomaly Detection

Generative Model-Based Feature Attention Module for Video Action Analysis

OmViD: Omni-supervised active learning for video action detection

A Survey on Video Anomaly Detection via Deep Learning: Human, Vehicle, and Environment

Reliable Smoke Detection via Optical Flow-Guided Feature Fusion and Transformer-Based Uncertainty Modeling

Multiscale Video Transformers for Class Agnostic Segmentation in Autonomous Driving

EventSSEG: Event-driven Self-Supervised Segmentation with Probabilistic Attention
