Inference of compressed Potts graphical models

  • Francesca Rizzato
  • Alice Coucke
  • Eleonora De Leonardis
  • John P. Barton
  • Jérôme Tubiana
  • Rémi Monasson
  • Simona Cocco

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the problem of inferring a graphical Potts model on a population of variables. This inverse Potts problem generally involves the inference of a large number of parameters, often larger than the number of available data, and hence requires the introduction of regularization. We study here a double regularization scheme, in which the number of Potts states (colors) available to each variable is reduced and interaction networks are made sparse. To achieve the color compression, only Potts states with large empirical frequency (exceeding some threshold) are explicitly modeled on each site, while the others are grouped into a single state. We benchmark the performance of this mixed regularization approach with two inference algorithms, adaptive cluster expansion (ACE) and pseudolikelihood maximization (PLM), on synthetic data obtained by sampling disordered Potts models on Erdős-Rényi random graphs. We show in particular that color compression does not affect the quality of reconstruction of the parameters corresponding to high-frequency symbols, while drastically reducing the number of the other parameters and thus the computational time. Our procedure is also applied to multisequence alignments of protein families, with similar results.
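The color compression step described in the abstract can be illustrated with a minimal sketch. It assumes the data are given as an integer array of samples by sites with states labeled 0 to q-1, and that a state is kept when its empirical frequency on a site strictly exceeds the threshold. The function name `compress_colors`, its arguments, and the choice to place the grouped state at the last index are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def compress_colors(samples, q, threshold):
    """Hypothetical helper: regroup low-frequency Potts states site by site.

    samples   : (n_samples, n_sites) integer array with entries in 0..q-1
    q         : number of Potts states before compression
    threshold : states with empirical frequency <= threshold are merged

    Returns (compressed, kept), where kept[i] lists the original states
    that remain explicitly modeled on site i.
    """
    samples = np.asarray(samples)
    n_samples, n_sites = samples.shape
    compressed = np.empty_like(samples)
    kept = []
    for i in range(n_sites):
        # Empirical single-site frequencies of each of the q states.
        freqs = np.bincount(samples[:, i], minlength=q) / n_samples
        keep = np.flatnonzero(freqs > threshold)
        kept.append(keep)
        # Lookup table: kept states map to 0..k-1, all others share index k.
        lookup = np.full(q, len(keep), dtype=samples.dtype)
        lookup[keep] = np.arange(len(keep))
        compressed[:, i] = lookup[samples[:, i]]
    return compressed, kept

# Toy data: 5 samples, 2 sites, q = 3 states (made up for illustration).
samples = np.array([[0, 1], [0, 2], [0, 1], [1, 1], [2, 0]])
compressed, kept = compress_colors(samples, q=3, threshold=0.25)
```

In this toy example each site retains one state explicitly plus the grouped state, so the number of field and coupling parameters to infer shrinks accordingly.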

Original language: English
Article number: 012309
Journal: Physical Review E
Volume: 101
Issue number: 1
Publication status: Published - 23 Jan 2020
