TY - GEN
T1 - Online A-Optimal Design and Active Linear Regression
AU - Fontaine, Xavier
AU - Perrault, Pierre
AU - Valko, Michal
AU - Perchet, Vianney
N1 - Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021/1/1
Y1 - 2021/1/1
N2 - We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate (equation presented) of the hidden parameter β✶ of an underlying linear model. The key challenge of this work lies in the heteroscedasticity assumption that we make, meaning that each covariate has a different and unknown variance. The goal of the decision maker is then to figure out on the fly the optimal way to allocate the total budget of T samples between covariates, as sampling several times a specific one will reduce the variance of the estimated model around it (but at the cost of a possible higher variance elsewhere). By trying to minimize the ℓ2-loss E[||(equation presented) - β✶||2] the decision maker is actually minimizing the trace of the covariance matrix of the problem, which corresponds then to online A-optimal design. Combining techniques from bandit and convex optimization we propose a new active sampling algorithm and we compare it with existing ones. We provide theoretical guarantees of this algorithm in different settings, including a O(T-2) regret bound in the case where the covariates form a basis of the feature space, generalizing and improving existing results. Numerical experiments validate our theoretical findings.
AB - We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate (equation presented) of the hidden parameter β✶ of an underlying linear model. The key challenge of this work lies in the heteroscedasticity assumption that we make, meaning that each covariate has a different and unknown variance. The goal of the decision maker is then to figure out on the fly the optimal way to allocate the total budget of T samples between covariates, as sampling several times a specific one will reduce the variance of the estimated model around it (but at the cost of a possible higher variance elsewhere). By trying to minimize the ℓ2-loss E[||(equation presented) - β✶||2] the decision maker is actually minimizing the trace of the covariance matrix of the problem, which corresponds then to online A-optimal design. Combining techniques from bandit and convex optimization we propose a new active sampling algorithm and we compare it with existing ones. We provide theoretical guarantees of this algorithm in different settings, including a O(T-2) regret bound in the case where the covariates form a basis of the feature space, generalizing and improving existing results. Numerical experiments validate our theoretical findings.
UR - https://www.scopus.com/pages/publications/85134053262
M3 - Conference contribution
AN - SCOPUS:85134053262
T3 - Proceedings of Machine Learning Research
SP - 3374
EP - 3383
BT - Proceedings of the 38th International Conference on Machine Learning, ICML 2021
PB - ML Research Press
T2 - 38th International Conference on Machine Learning, ICML 2021
Y2 - 18 July 2021 through 24 July 2021
ER -