TY - GEN
T1 - A study on the impact of the distance types involved in protein structure determination by NMR
AU - Hengeveld, Simon B.
AU - Malliavin, T.
AU - Lin, J. H.
AU - Liberti, L.
AU - Mucherino, A.
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/1/1
Y1 - 2021/1/1
AB - The Distance Geometry Problem (DGP) consists of finding the coordinates of a given set of points where the distances between some pairs of points are known. The DGP has several applications and one of the most relevant ones arises in the context of structural biology, where NMR experiments are performed to estimate distances between some atom pairs in a given molecule, and the possible conformations for the molecule are calculated through the formulation and the solution of a DGP. We focus our attention on DGP instances for which some special assumptions allow us to discretize the DGP search space and to potentially perform the complete enumeration of the solution set. We refer to the subclass of DGP instances satisfying such discretizability assumptions as the Discretizable DGP (DDGP). In this context, we propose a new procedure for the generation of DDGP instances where real data and simulated data (from known molecular models) can coexist. Our procedure can give rise to peculiar DDGP instances that we use for studying the impact of every distance type, involved in NMR protein structure determination, on the quality of the found solutions. Surprisingly, our experiments suggest that the distance types implying a larger effect on the solution quality are not the ones related to NMR data, but rather the more abundant, but much less informative, van der Waals distance type.
U2 - 10.1109/BIBM52615.2021.9669336
DO - 10.1109/BIBM52615.2021.9669336
M3 - Conference contribution
AN - SCOPUS:85125181742
T3 - Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
SP - 2502
EP - 2510
BT - Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
A2 - Huang, Yufei
A2 - Kurgan, Lukasz
A2 - Luo, Feng
A2 - Hu, Xiaohua Tony
A2 - Chen, Yidong
A2 - Dougherty, Edward
A2 - Kloczkowski, Andrzej
A2 - Li, Yaohang
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Y2 - 9 December 2021 through 12 December 2021
ER -