Abstract
M-estimators are ubiquitous in machine learning and statistical learning theory. They are used both for defining prediction strategies and for evaluating their precision. In this paper, we propose the first non-asymptotic “any-time” deviation bounds for general M-estimators, where “any-time” means that the bound holds with a prescribed probability for every sample size. These bounds are non-asymptotic versions of the law of the iterated logarithm. They are established under general assumptions such as Lipschitz continuity of the loss function and (local) curvature of the population risk. These conditions are satisfied for most examples used in machine learning, including those ensuring robustness to outliers and to heavy-tailed distributions. As an application, we consider the problem of best arm identification in a stochastic multi-armed bandit setting. We show that the established bound can be converted into a new algorithm with provably optimal theoretical guarantees. Numerical experiments illustrating the validity of the algorithm are reported.
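For intuition, an any-time deviation bound of law-of-the-iterated-logarithm type typically takes the following shape; the constant $C$, the rate, and the norm below are placeholders for exposition, not the paper's exact statement:

```latex
% Illustrative LIL-type anytime bound (constants and rate are assumptions):
% with probability at least 1 - \delta, simultaneously for all n \ge 3,
\[
  \bigl\lVert \hat\theta_n - \theta^* \bigr\rVert
  \;\le\;
  C \sqrt{\frac{\log\log n + \log(1/\delta)}{n}},
\]
% where \hat\theta_n is the M-estimator after n samples and \theta^*
% is the population risk minimizer.
```

To make the bandit application concrete, here is a minimal sketch of elimination-style best arm identification driven by an anytime confidence radius. The radius formula, the constant `c`, and all function names are illustrative assumptions; this is not the algorithm proposed in the paper.

```python
import math
import random

def lil_radius(n: int, delta: float, c: float = 2.0) -> float:
    """Anytime (LIL-type) confidence radius after n samples.

    Hypothetical form for illustration: with probability >= 1 - delta,
    the empirical mean stays within this radius of the true mean for
    ALL n simultaneously. The constant c is an assumption.
    """
    n = max(n, 3)  # keep log(log(n)) well defined and positive
    return c * math.sqrt((math.log(math.log(n)) + math.log(1.0 / delta)) / n)

def best_arm_elimination(pull, n_arms: int, delta: float,
                         max_pulls: int = 100_000) -> int:
    """Successive elimination using anytime confidence intervals.

    `pull(a)` returns a stochastic reward for arm `a`. An arm is
    discarded once its upper confidence bound falls below the best
    lower confidence bound; because the radius is anytime-valid, this
    check is safe after every round without a union bound over n.
    """
    active = set(range(n_arms))
    counts = [0] * n_arms
    means = [0.0] * n_arms
    delta_arm = delta / n_arms  # union bound over arms only
    total = 0
    while len(active) > 1 and total < max_pulls:
        for a in list(active):
            r = pull(a)
            counts[a] += 1
            total += 1
            means[a] += (r - means[a]) / counts[a]  # running mean
        radii = {a: lil_radius(counts[a], delta_arm) for a in active}
        best_lcb = max(means[a] - radii[a] for a in active)
        active = {a for a in active if means[a] + radii[a] >= best_lcb}
    return max(active, key=lambda a: means[a])

if __name__ == "__main__":
    random.seed(0)
    true_means = [0.5, 0.6, 0.7, 0.9]  # arm 3 is best
    winner = best_arm_elimination(
        lambda a: random.gauss(true_means[a], 1.0), 4, delta=0.05)
    print("identified arm:", winner)
```

In this sketch the sample means could be replaced by robust M-estimators of each arm's mean, which is the setting where the paper's bounds for heavy-tailed rewards would apply.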
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1331-1341 |
| Number of pages | 11 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 108 |
| Publication status | Published - 1 Jan 2020 |
| Event | 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020 (Virtual, Online), 26 Aug 2020 → 28 Aug 2020 |