The field of neural architecture design is moving toward more efficient, better-optimized models. Recent developments have focused on improving vision-language models, neural architecture search (NAS), and transformer architectures. Researchers are exploring techniques such as skip-connection removal, compute-in-memory-aware NAS, and collaborative large language model (LLM)-guided search to achieve better efficiency and accuracy. Noteworthy papers in this area include:
- CIMNAS, which introduces a joint framework for compute-in-memory-aware neural architecture search and achieves significant reductions in energy-delay-area product (EDAP).
- CoLLM-NAS, which presents a neural architecture search framework in which collaborating large language models efficiently guide the search process, achieving state-of-the-art results.
- Cutting the Skip, which enables stable and efficient training of skipless transformers (a minimal sketch of a skipless block follows this list) and opens new avenues for hierarchical representation learning in vision models.
- Composer, which discovers new hybrid LLM architectures that outperform existing models and improve training and inference efficiency.
- GLAI, which introduces a new architectural block that separates structural and quantitative knowledge, enabling a more efficient training process.
- PEL-NAS, which partitions the search space and co-evolves architecture prompts with an LLM to perform hardware-aware neural architecture search, generating networks with high accuracy and low latency.
- Rethinking the shape convention of an MLP, which challenges the conventional narrow-wide-narrow design and proposes a wide-narrow-wide Hourglass MLP block that achieves a superior performance-parameter Pareto frontier (see the second sketch after this list).
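To make the "skipless" idea in Cutting the Skip concrete, the sketch below shows a transformer block with the residual (skip) connections around the attention and MLP sublayers removed. This is a minimal illustration under that assumption, not the paper's implementation: the class name, layer widths, and head count are invented, and the stabilization techniques that make such blocks trainable are not shown.

```python
import torch
import torch.nn as nn

class SkiplessBlock(nn.Module):
    """A pre-norm transformer block with the residual (skip) connections removed (illustrative)."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # A standard block would compute: x = x + attn(norm1(x)); x = x + mlp(norm2(x)).
        # Here each sublayer's output replaces its input instead of being added to it.
        h = self.norm1(x)
        h, _ = self.attn(h, h, h)
        return self.mlp(self.norm2(h))

x = torch.randn(2, 16, 256)        # (batch, tokens, d_model)
print(SkiplessBlock(256)(x).shape)  # torch.Size([2, 16, 256])
```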
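The second sketch contrasts the conventional narrow-wide-narrow MLP with the wide-narrow-wide hourglass shape discussed in Rethinking the shape convention of an MLP. It is a minimal sketch rather than the paper's implementation: the class names, the 4x expansion factor, the bottleneck size, and the residual connection on the wide stream are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ConventionalMLP(nn.Module):
    """Narrow-wide-narrow: expand the hidden dimension, then project back down."""
    def __init__(self, d_model: int, expansion: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, expansion * d_model),  # narrow -> wide
            nn.GELU(),
            nn.Linear(expansion * d_model, d_model),  # wide -> narrow
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class HourglassMLP(nn.Module):
    """Wide-narrow-wide: keep a wide stream and squeeze it through a bottleneck (illustrative)."""
    def __init__(self, d_wide: int, bottleneck: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_wide, bottleneck),  # wide -> narrow
            nn.GELU(),
            nn.Linear(bottleneck, d_wide),  # narrow -> wide
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual on the wide stream, so information is not forced through the bottleneck.
        return x + self.net(x)

x = torch.randn(2, 16, 512)
print(ConventionalMLP(512)(x).shape)    # torch.Size([2, 16, 512])
print(HourglassMLP(512, 128)(x).shape)  # torch.Size([2, 16, 512])
```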