
FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings

  • Jean Ogier du Terrail
  • Samy Safwan Ayed
  • Edwige Cyffers
  • Felix Grimberg
  • Chaoyang He
  • Regis Loeb
  • Paul Mangold
  • Tanguy Marchand
  • Othmane Marfoq
  • Erum Mushtaq
  • Boris Muzellec
  • Constantin Philippenko
  • Santiago Silva
  • Maria Teleńczuk
  • Shadi Albarqouni
  • Salman Avestimehr
  • Aurélien Bellet
  • Aymeric Dieuleveut
  • Martin Jaggi
  • Sai Praneeth Karimireddy
  • Marco Lorenzi
  • Giovanni Neglia
  • Marc Tommasi
  • Mathieu Andreux
  • Inc
  • INRIA
  • Université de Lille
  • EPFL
  • Inc.
  • University of Southern California
  • Ecole polytechnique
  • University Hospital of Bonn
  • Helmholtz Munich
  • University of California, Berkeley

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models, without centralizing data. The cross-silo FL setting corresponds to the case of few (2-50) reliable clients, each holding medium to large datasets, and is typically found in applications such as healthcare, finance, or industry. While previous works have proposed representative datasets for cross-device FL, few realistic healthcare cross-silo FL datasets exist, thereby slowing algorithmic research in this critical application. In this work, we propose a novel cross-silo dataset suite focused on healthcare, FLamby (Federated Learning AMple Benchmark of Your cross-silo strategies), to bridge the gap between theory and practice of cross-silo FL. FLamby encompasses 7 healthcare datasets with natural splits, covering multiple tasks, modalities, and data volumes, each accompanied by baseline training code. As an illustration, we additionally benchmark standard FL algorithms on all datasets. Our flexible and modular suite allows researchers to easily download datasets, reproduce results, and re-use the different components for their research. FLamby is available at www.github.com/owkin/flamby.

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Neural information processing systems foundation
ISBN (Electronic): 9781713871088
Publication status: Published - 1 Jan 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: 28 Nov 2022 to 9 Dec 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 28/11/22 to 9/12/22
