The field of on-device intelligence is moving toward compact, efficient architectures that can handle diverse machine learning tasks. Researchers are exploring approaches that prioritize compactness, generalizability, and the ability to capture complex patterns. One notable direction is the design of architectures that adapt seamlessly across application domains such as regression, classification, natural language processing, and computer vision. Another focus is the optimization of convolutional layers for tiny FPGAs and energy-constrained CPUs, which is crucial for real-time embedded applications. Noteworthy papers include:
- CURA, which proposes a compact universal architecture that achieves superior forecasting accuracy on complex patterns while requiring significantly fewer parameters.
- smallNet, which implements a convolutional layer on tiny FPGAs, demonstrating a 5.1x speedup and over 81% classification accuracy.
- Benchmarking Deep Learning Convolutions on Energy-constrained CPUs, which evaluates state-of-the-art convolution algorithms for CPU-based deep learning inference and provides practical guidance for energy-aware embedded deployment.
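The compactness and efficiency themes above share a simple arithmetic motivation. As a minimal illustrative sketch (not drawn from any of the papers listed, and using hypothetical layer shapes), the snippet below compares the parameter and multiply-accumulate (MAC) counts of a standard convolution against a depthwise-separable one, a widely used technique for shrinking convolutional models on constrained hardware:

```python
# Illustrative cost model: standard vs. depthwise-separable convolution.
# Layer shapes below are hypothetical examples, not from the papers above.

def conv2d_costs(c_in, c_out, k, h, w):
    """Standard k x k convolution: every output channel sees every input channel."""
    params = c_out * c_in * k * k
    macs = params * h * w  # one MAC per kernel weight per output position
    return params, macs

def depthwise_separable_costs(c_in, c_out, k, h, w):
    """Depthwise k x k conv (one filter per input channel) + 1x1 pointwise conv."""
    dw_params = c_in * k * k      # depthwise stage
    pw_params = c_in * c_out      # pointwise (1x1) stage
    params = dw_params + pw_params
    macs = params * h * w
    return params, macs

# Hypothetical layer: 32 -> 64 channels, 3x3 kernel, 56x56 feature map.
std = conv2d_costs(32, 64, 3, 56, 56)
sep = depthwise_separable_costs(32, 64, 3, 56, 56)
print(f"standard:  {std[0]} params, {std[1] / 1e6:.1f}M MACs")
print(f"separable: {sep[0]} params, {sep[1] / 1e6:.1f}M MACs")
print(f"reduction: {std[0] / sep[0]:.1f}x fewer parameters")
```

For these example shapes the separable form needs roughly 8x fewer parameters and MACs, which is the kind of gap that determines whether a layer fits in a tiny FPGA's block RAM or meets an embedded CPU's energy budget.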