Functional effects of mutations in proteins can be predicted and interpreted by guided selection of sequence covariation information

Research output: Contribution to journal › Article › peer-review

Abstract

Predicting the effects of one or more mutations on the in vivo or in vitro properties of a wild-type protein is a major computational challenge because of epistasis, that is, interactions between amino acids in the sequence. We introduce a computationally efficient procedure to build minimal epistatic models that predict mutational effects by combining evolutionary (homologous sequence) data with a small amount of mutational-scan data. Mutagenesis measurements guide the selection of links in a sparse graphical model, while the parameters on the nodes and edges are inferred from sequence data. We show, on 10 mutational scans, that our pipeline performs comparably to state-of-the-art deep networks trained on much more data, while requiring far fewer parameters and hence being more interpretable. In particular, the identified interactions adapt to the wild-type protein and to the fitness or biochemical property measured experimentally, mostly focus on key functional sites, and are not necessarily related to structural contacts. Our method is therefore able to extract the information relevant to one mutational experiment from homologous sequence data, which reflect the multitude of structural and functional constraints acting on proteins throughout evolution.
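The idea of letting mutagenesis measurements choose the links of a sparse Potts model, while fields and couplings come from sequence data, can be illustrated with a small sketch. This is not the authors' pipeline: the function names (`delta_energy`, `greedy_edge_selection`), the use of Pearson correlation as the selection criterion, the greedy forward search, and the toy parameters are all assumptions made for illustration. In practice the fields `h` and couplings `J` would be inferred from a multiple sequence alignment rather than written by hand.

```python
import numpy as np

def delta_energy(wt, mutations, h, J, edges):
    """Energy change E(mutant) - E(wild type) under a sparse Potts model.

    Assumed convention: E(s) = -sum_i h[i, s_i] - sum_{(i,j) in edges} J[(i,j)][s_i, s_j].
    wt: 1-D integer array (amino-acid indices); mutations: list of (position, new_state).
    h: array of shape (L, q); J: dict mapping (i, j), i < j, to a (q, q) array.
    """
    mut = wt.copy()
    for pos, aa in mutations:
        mut[pos] = aa
    changed = {pos for pos, _ in mutations}
    # Field terms differ only at mutated sites.
    d = sum(h[i, wt[i]] - h[i, mut[i]] for i in changed)
    # Coupling terms differ only on active edges touching a mutated site.
    for i, j in edges:
        if i in changed or j in changed:
            d += J[(i, j)][wt[i], wt[j]] - J[(i, j)][mut[i], mut[j]]
    return d

def greedy_edge_selection(wt, variants, measured, h, J, max_edges):
    """Add candidate edges one at a time while the correlation with the data improves."""
    def score(edges):
        # Predicted effect is -delta_energy: a lower mutant energy means a fitter variant.
        pred = np.array([-delta_energy(wt, v, h, J, edges) for v in variants])
        if np.std(pred) == 0:
            return 0.0
        return np.corrcoef(pred, measured)[0, 1]

    selected, best = [], score([])
    candidates = sorted(J)  # sorted for a deterministic search order
    for _ in range(max_edges):
        gains = {e: score(selected + [e]) for e in candidates}
        if not gains:
            break
        e_best = max(gains, key=gains.get)
        if gains[e_best] <= best:
            break  # no candidate edge improves the fit
        selected.append(e_best)
        candidates.remove(e_best)
        best = gains[e_best]
    return selected, best

# Toy example (hypothetical numbers): 3 sites, 2 states, wild type = state 0 everywhere.
wt = np.array([0, 0, 0])
h = np.array([[0.0, -1.0], [0.0, -2.0], [0.0, -0.5]])
J = {
    (0, 1): np.array([[0.0, 0.0], [0.0, -0.7]]),
    (0, 2): np.array([[0.0, 0.0], [0.0, 1.5]]),
    (1, 2): np.array([[0.0, 0.0], [0.0, 0.4]]),
}
variants = [[(0, 1)], [(1, 1)], [(2, 1)],
            [(0, 1), (1, 1)], [(0, 1), (2, 1)], [(1, 1), (2, 1)]]
# Synthetic "measurements", generated with only the (0, 2) coupling active.
measured = np.array([-delta_energy(wt, v, h, J, [(0, 2)]) for v in variants])
selected, best = greedy_edge_selection(wt, variants, measured, h, J, max_edges=3)
```

Because the synthetic measurements were generated with only the (0, 2) coupling active, the greedy loop recovers that single edge and then stops. With real data one would typically hold out part of the mutational scan to decide when to stop adding edges.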

Original language: English
Article number: e2312335121
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 121
Issue number: 26
Publication status: Published - 1 Jun 2024

Keywords

  • fitness prediction
  • functional networks
  • inference of sparse graphical Potts models
  • multiple sequence alignments
  • mutational scans
