Efficient and Robust Models for Edge Devices

Computer vision and machine learning research is increasingly focused on efficient, robust models that can be deployed on edge devices. The aim is lightweight networks that perform tasks such as object tracking, material perception, and image classification in real time while remaining robust to adversarial attacks and other sources of noise. Techniques such as knowledge distillation, pruning, and quantization compress large models into smaller, more efficient ones with little loss in accuracy, and new architectures and training methods are being designed around the constraints of edge hardware, namely limited compute and memory.

Noteworthy papers in this area include a lightweight RGB object tracking algorithm for augmented reality devices that achieves state-of-the-art accuracy while running in real time on a mobile AR headset, and a feature-based knowledge distillation framework that uses cross-attention to strengthen knowledge transfer, outperforming existing attention-guided distillation methods on object detection and image segmentation tasks.
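As a point of reference for the distillation theme running through these papers, the sketch below shows the classic logit-distillation objective (soft-target KL term blended with hard-label cross-entropy). It is a minimal, generic illustration; the temperature and weighting values are arbitrary, and none of the listed papers necessarily uses exactly this formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Generic knowledge-distillation loss: match the teacher's softened
    output distribution while still fitting the ground-truth labels."""
    # Soften both distributions with the temperature, then compare them.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd_term = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    # Standard supervised term on the hard labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1 - alpha) * ce_term
```

In practice the small student model is trained with this combined loss while the large teacher is kept frozen, which is what allows the compressed model to retain most of the teacher's accuracy on-device.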

Sources

Deep Learning-based Lightweight RGB Object Tracking for Augmented Reality Devices

Vulnerability-Aware Robust Multimodal Adversarial Training

CrossJEPA: Cross-Modal Joint-Embedding Predictive Architecture for Efficient 3D Representation Learning from 2D Images

Uncertainty-Aware Dual-Student Knowledge Distillation for Efficient Image Classification

Towards Characterizing Knowledge Distillation of PPG Heart Rate Estimation Models

From Raw Features to Effective Embeddings: A Three-Stage Approach for Multimodal Recipe Recommendation

TouchFormer: A Robust Transformer-based Framework for Multimodal Material Perception

Towards Edge General Intelligence: Knowledge Distillation for Mobile Agentic AI

Foundry: Distilling 3D Foundation Models for the Edge

One Patch is All You Need: Joint Surface Material Reconstruction and Classification from Minimal Visual Cues

FANoise: Singular Value-Adaptive Noise Modulation for Robust Multimodal Representation Learning

CanKD: Cross-Attention-based Non-local operation for Feature-based Knowledge Distillation

Multimodal Robust Prompt Distillation for 3D Point Cloud Models

Continual Error Correction on Low-Resource Devices
