Optimization Advances in Deep Learning

The field of deep learning is seeing significant advances in optimization techniques, with a focus on improving generalization, robustness, and convergence. Researchers are exploring novel optimizers, such as those incorporating dynamic scaling and adaptive damping, to improve training efficiency and stability. There is also growing interest in analyzing the convergence behavior of stochastic gradient descent with momentum (SGDM) under various learning rate and batch size schedules, and in integrating scaling laws into deep reinforcement learning (DRL) while balancing scalability against computational cost. Noteworthy papers in this area include:

- ZetA, a deep learning optimizer that extends Adam with dynamic scaling based on the Riemann zeta function, demonstrating improved generalization and robustness.
- Accelerating SGDM via Learning Rate and Batch Size Schedules, which analyzes the convergence of SGDM under dynamic learning rate and batch size schedules, providing a unified theoretical foundation and practical guidance for designing efficient and stable training procedures (an illustrative joint schedule is sketched after this list).
- Neural Network Training via Stochastic Alternating Minimization with Trainable Step Sizes, which updates network parameters block by block in an alternating manner, reducing per-step computational overhead and improving training stability in nonconvex settings (a minimal block-alternating sketch also follows).
- Optimal Growth Schedules for Batch Size and Learning Rate in SGD, which derives growth schedules for the batch size and learning rate that reduce stochastic first-order oracle (SFO) complexity, offering both theoretical insight and practical guidelines for scalable and efficient large-batch training.
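
To make the scheduling ideas concrete, here is a minimal sketch of a joint learning-rate/batch-size schedule of the kind these analyses cover: the batch size grows geometrically at fixed intervals while the learning rate follows cosine decay. The constants (`growth_every`, `growth_factor`, cosine decay) are illustrative assumptions, not the schedules derived in the cited papers.

```python
import math

def lr_batch_schedule(epoch, base_lr=0.1, base_batch=128,
                      total_epochs=100, growth_every=30, growth_factor=2.0):
    """Illustrative joint schedule for SGDM-style training.

    The batch size grows geometrically every `growth_every` epochs while the
    learning rate follows cosine decay over the full run. This is a generic
    recipe, not the optimal schedule derived in the cited papers.
    """
    # Geometric batch-size growth (hypothetical interval and factor).
    batch_size = int(base_batch * growth_factor ** (epoch // growth_every))
    # Cosine learning-rate decay over the full training horizon.
    lr = 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr, batch_size

# Inspect the schedule at a few epochs.
for e in (0, 30, 60, 99):
    print(e, lr_batch_schedule(e))
```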

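The alternating-update idea can likewise be sketched on a toy model: each iteration touches only one parameter block, and each block keeps its own step size. The per-block step-size heuristic below (grow on improvement, shrink otherwise) is a simple stand-in for the paper's trainable step sizes, which are not specified in this summary; the model and constants are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer linear model y ~ x @ W1 @ W2, trained by alternating block
# updates: each step modifies only W1 or W2, the generic idea behind
# alternating-minimization training.
n, d, h, k = 256, 20, 10, 5
X = rng.normal(size=(n, d))
Y = rng.normal(size=(n, k))
W1 = rng.normal(scale=0.1, size=(d, h))
W2 = rng.normal(scale=0.1, size=(h, k))

def loss(W1, W2):
    return 0.5 * np.mean((X @ W1 @ W2 - Y) ** 2)

step = {"W1": 0.1, "W2": 0.1}           # per-block step sizes (adapted below)
for it in range(200):
    batch = rng.choice(n, size=32, replace=False)
    Xb, Yb = X[batch], Y[batch]
    resid = Xb @ W1 @ W2 - Yb           # shared residual for both block gradients
    if it % 2 == 0:                     # update W1 on even iterations
        grad = Xb.T @ resid @ W2.T / resid.size
        name, W = "W1", W1
    else:                               # update W2 on odd iterations
        grad = (Xb @ W1).T @ resid / resid.size
        name, W = "W2", W2
    before = loss(W1, W2)
    W -= step[name] * grad              # in-place update of the active block
    # Crude step-size adaptation: grow if the full loss dropped, shrink otherwise.
    step[name] *= 1.05 if loss(W1, W2) < before else 0.5

print("final loss:", loss(W1, W2))
```
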
Sources

ZetA: A Riemann Zeta-Scaled Extension of Adam for Deep Learning

Accelerating SGDM via Learning Rate and Batch Size Schedules: A Lyapunov-Based Analysis

Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies

Neural Network Training via Stochastic Alternating Minimization with Trainable Step Sizes

Comparative Analysis of Novel NIRMAL Optimizer Against Adam and SGD with Momentum

Robustly Learning Monotone Single-Index Models

Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization

Optimal Growth Schedules for Batch Size and Learning Rate in SGD that Reduce SFO Complexity

Adaptive Batch Size and Learning Rate Scheduler for Stochastic Gradient Descent Based on Minimization of Stochastic First-order Oracle Complexity

Cumulative Learning Rate Adaptation: Revisiting Path-Based Schedules for SGD and Adam
