Advances in Multi-Armed Bandits and Algorithm Configuration

The field of multi-armed bandits and algorithm configuration is advancing rapidly, with a focus on developing more efficient and effective algorithms for real-world applications. Recent research has explored exploration-free algorithms, which have proven effective for multi-group mean estimation and related settings, and there is growing interest in algorithms that can handle complex constraints and nonlinear relationships, such as those arising in contextual bandit settings. Researchers have also worked on improving the efficiency and effectiveness of algorithm configuration methods, including utilitarian algorithm configuration procedures and the application of Thompson sampling via fine-tuning of large language models. Related numerical advances include a new transformation that improves the convergence rate of the Sinc approximation. Together, these developments have the potential to significantly improve the performance and efficiency of a wide range of applications, from experimental design to decision-making under uncertainty.

Noteworthy papers include:

Exploration-free Algorithms for Multi-group Mean Estimation, which strengthens existing results on subgaussian variance concentration and proposes exploration-free non-adaptive and adaptive algorithms.

Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits, which develops a unified algorithmic framework for ensemble sampling and establishes high-probability frequentist regret bounds.
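To make the Thompson sampling idea that recurs in these works concrete, here is a minimal sketch of the classic Beta-Bernoulli version for a K-armed bandit. This is an illustration of the general technique only, not the method of any paper listed below; the arm means and horizon are arbitrary example values.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling on a K-armed Bernoulli bandit.

    Keeps a Beta(successes + 1, failures + 1) posterior per arm; each
    round, samples one value from every posterior and pulls the arm
    whose sample is largest, then updates that arm's counts.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    total_reward = 0
    for _ in range(horizon):
        # Draw one posterior sample per arm and play the argmax.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Bernoulli reward from the (unknown to the learner) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward, successes, failures

total, s, f = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
```

As exploration is driven purely by posterior uncertainty, pulls concentrate on the best arm over time without any explicit exploration schedule; ensemble sampling, as in the anytime ensemble sampling paper above, approximates this posterior-sampling behavior with a maintained ensemble of models instead of an exact posterior.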

Sources

Exploration-free Algorithms for Multi-group Mean Estimation

Discovering interpretable piecewise nonlinear model predictive control laws via symbolic decision trees

Budget Allocation for Unknown Value Functions in a Lipschitz Space

Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits

DE-Sinc approximation for unilateral rapidly decreasing functions and its computational error bound

Multi-Armed Bandits with Minimum Aggregated Revenue Constraints

Sample-Efficient Omniprediction for Proper Losses

Thompson Sampling via Fine-Tuning of LLMs

LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization

Practical, Utilitarian Algorithm Configuration

Error analysis of Abate--Whitt methods for Inverse Laplace Transforms and a new algorithm for queuing theory applications

Prediction-Specific Design of Learning-Augmented Algorithms
