Computer vision is moving toward efficient models that can run on resource-constrained devices. Recent work has focused on vision transformers, which deliver strong results across many vision tasks but whose computational demands often make them impractical on edge hardware. To close this gap, researchers have proposed novel architectures and applied techniques such as knowledge distillation, pruning, and quantization to reduce model complexity. These advances have markedly improved the efficiency and accuracy of computer vision models, enabling deployment in real-world applications such as aerial object detection, crop monitoring, and autonomous vehicles. Notable papers in this area include CoSwin, which proposes a novel feature-fusion architecture for small-scale vision tasks, and BATR-FST, which introduces a bi-level adaptive token refinement approach for few-shot learning. Papers such as A Novel Compression Framework for YOLOv8 and Cott-ADNet likewise demonstrate the effectiveness of compression techniques and lightweight architectures for real-time object detection and image classification.
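To make the knowledge-distillation idea recurring in these papers concrete, here is a minimal sketch of the classic soft-target distillation loss (a temperature-scaled KL term from the teacher blended with the hard-label cross-entropy). It is written in plain Python for clarity and is not taken from any of the cited papers; the function names and the `alpha`/`temperature` hyperparameters are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Blend of soft-target KL divergence and hard-label cross-entropy.

    alpha weights the soft (teacher) term; the T**2 factor rescales its
    gradient magnitude, following the standard soft-target formulation.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_t = softmax(student_logits, temperature)
    # KL(teacher || student) over temperature-softened distributions
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student_t))
    # Ordinary cross-entropy against the ground-truth class index
    p_student = softmax(student_logits)
    hard = -math.log(p_student[label])
    return alpha * (temperature ** 2) * soft + (1 - alpha) * hard
```

When the student exactly matches the teacher, the KL term vanishes and only the hard-label term remains; a higher temperature exposes more of the teacher's "dark knowledge" in the non-target classes.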
Advances in Efficient Computer Vision Models
Sources
Adaptive Knowledge Distillation using a Device-Aware Teacher for Low-Complexity Acoustic Scene Classification
GhostNetV3-Small: A Tailored Architecture and Comparative Study of Distillation Strategies for Tiny Images
A Novel Compression Framework for YOLOv8: Achieving Real-Time Aerial Object Detection on Edge Devices via Structured Pruning and Channel-Wise Distillation