Passer à la navigation principale Passer à la recherche Passer au contenu principal

Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents

  • Safwan Labbi
  • , Daniil Tiapkin
  • , Lorenzo Mancini
  • , Paul Mangold
  • , Eric Moulines

Résultats de recherche: Contribution à un journalArticle de conférenceRevue par des pairs

Résumé

In this paper, we present the Federated Upper Confidence Bound Value Iteration algorithm (Fed-UCBVI), a novel extension of the UCBVI algorithm (Azar et al., 2017) tailored for the federated learning framework. We prove that the regret of Fed-UCBVI scales as Õ(√H3|S||A|T/M), with a small additional term due to heterogeneity, where |S| is the number of states, |A| is the number of actions, H is the episode length, M is the number of agents, and T is the number of episodes. Notably, in the single-agent setting, this upper bound matches the minimax lower bound up to polylogarithmic factors, while in the multi-agent scenario, Fed-UCBVI has linear speed-up. To conduct our analysis, we introduce a new measure of heterogeneity, which may hold independent theoretical interest. Furthermore, we show that, unlike existing federated reinforcement learning approaches, Fed-UCBVI's communication complexity only marginally increases with the number of agents.

langue originaleAnglais
Pages (de - à)1315-1323
Nombre de pages9
journalProceedings of Machine Learning Research
Volume258
étatPublié - 1 janv. 2025
Evénement28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025 - Mai Khao, Thadlande
Durée: 3 mai 20255 mai 2025

Empreinte digitale

Examiner les sujets de recherche de « Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents ». Ensemble, ils forment une empreinte digitale unique.

Contient cette citation