Résumé
We improve the efficiency of algorithms for stochastic combinatorial semi-bandits. In most interesting problems, state-of-the-art algorithms take advantage of structural properties of rewards, such as independence. However, while being optimal in terms of asymptotic regret, these al-gorithms are inefficient. In our paper, we first reduce their implementation to a specific submod-ular maximization. Then, in case of matroid con-straints, we design adapted approximation rou-tines, thereby providing the first efficient algo-rithms that rely on reward structure to improve regret bound. In particular, we improve the state-of-the-art efficient gap-free regret bound by a fac-tor√m/ log m, where m is the maximum action size. Finally, we show how our improvement translates to more general budgeted combinato-rial semi-bandits.
| langue originale | Anglais |
|---|---|
| Pages (de - à) | 5123-5132 |
| Nombre de pages | 10 |
| journal | Proceedings of Machine Learning Research |
| Volume | 97 |
| état | Publié - 1 janv. 2019 |
| Modification externe | Oui |
| Evénement | 36th International Conference on Machine Learning, ICML 2019 - Long Beach, États-Unis Durée: 9 juin 2019 → 15 juin 2019 |
Empreinte digitale
Examiner les sujets de recherche de « Exploiting Structure of Uncertainty for Efficient Matroid Semi-Bandits ». Ensemble, ils forment une empreinte digitale unique.Contient cette citation
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver