The field of edge AI is advancing rapidly, driven by the need to run complex models efficiently, securely, and with low latency on resource-constrained devices. Researchers are exploring techniques to compress and optimize models, mitigate adversarial attacks, and enhance collaborative learning across distributed devices. Notably, advances in model pruning, low-rank compression, and token communication are enabling the deployment of large language models and multimodal large models in edge environments. Furthermore, the integration of digital twin data, parameter-efficient fine-tuning, and bandwidth-adaptive token offloading is improving the performance and privacy of edge AI systems.
Several notable papers illustrate these directions. 'Sponge Attacks on Sensing AI' presents a systematic study of energy-latency sponge attacks on sensing-based AI models and investigates model pruning as a potential defense. 'Dynamical Low-Rank Compression of Neural Networks' introduces a spectral regularizer that controls the condition number of the low-rank core in each layer, improving robustness under adversarial attacks. The 'PWC-MoE' framework proposes a privacy-aware wireless collaborative mixture-of-experts approach that balances computational cost, performance, and privacy protection under bandwidth constraints. Finally, 'AI2MMUM' presents a scalable, task-aware AI-air interface multimodal universal model for processing multimodal data and executing diverse air interface tasks in future wireless systems.
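To make the pruning-as-defense idea concrete, the sketch below applies unstructured L1 magnitude pruning to a small stand-in model using PyTorch's built-in pruning utilities. This is a minimal sketch of the general technique, not the defense evaluated in the sponge-attack paper; the architecture, sparsity level, and layer selection are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Small stand-in for a sensing model (architecture is illustrative only).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# Prune 50% of the smallest-magnitude weights in each conv/linear layer.
# With sparse-aware execution, fewer nonzero weights leave less compute
# for a sponge input to inflate into extra energy and latency.
for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the sparsity permanent

x = torch.randn(1, 3, 32, 32)
print(model(x).shape)  # torch.Size([1, 10])
```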
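The condition-number idea behind the low-rank compression work can likewise be sketched: a linear layer is factored through a small core matrix, and a penalty on the ratio of the core's largest to smallest singular value is added to the training loss so the layer cannot amplify perturbations arbitrarily. This is a minimal, hypothetical PyTorch sketch assuming a simple U C V^T factorization and a direct SVD-based penalty; the cited paper's actual regularizer and training scheme may differ.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear layer factored as U @ C @ V^T with a small rank-r core C.

    Hypothetical sketch: names and the exact penalty are illustrative,
    not the formulation used in the cited paper.
    """
    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) / rank ** 0.5)
        self.C = nn.Parameter(torch.eye(rank))  # low-rank core
        self.V = nn.Parameter(torch.randn(in_features, rank) / rank ** 0.5)

    def forward(self, x):
        return x @ self.V @ self.C.T @ self.U.T

    def condition_penalty(self) -> torch.Tensor:
        # Penalize the spectral condition number of the core,
        # kappa(C) = sigma_max / sigma_min; a well-conditioned core
        # limits how strongly adversarial perturbations are amplified.
        s = torch.linalg.svdvals(self.C)  # singular values, descending
        return s[0] / s[-1].clamp_min(1e-8)

# Training-loop fragment: add the regularizer to the task loss.
layer = LowRankLinear(512, 256, rank=16)
x = torch.randn(8, 512)
task_loss = layer(x).sum()  # stand-in for a real task loss
loss = task_loss + 1e-3 * layer.condition_penalty()
loss.backward()
```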