TY - GEN
T1 - Shift-collapse acceleration of generalized polarizable reactive molecular dynamics for machine learning-assisted computational synthesis of layered materials
AU - Liu, Kuang
AU - Tiwari, Subodh
AU - Sheng, Chunyang
AU - Krishnamoorthy, Aravind
AU - Hong, Sungwook
AU - Rajak, Pankaj
AU - Kalia, Rajiv K.
AU - Nakano, Aiichiro
AU - Nomura, Ken Ichi
AU - Vashishta, Priya
AU - Kunaseth, Manaschai
AU - Naserifar, Saber
AU - Goddard, William A.
AU - Luo, Ye
AU - Romero, Nichols A.
AU - Shimojo, Fuyuki
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Reactive molecular dynamics is a powerful simulation method for describing chemical reactions. Here, we introduce a new generalized polarizable reactive force-field (ReaxPQ+) model to significantly improve the accuracy by accommodating the reorganization of surrounding media. The increased computation is accelerated by (1) extended Lagrangian approach to eliminate the speed-limiting charge iteration, (2) shift-collapse computation of many-body renormalized n-tuples, which provably minimizes data transfer, (3) multithreading with round-robin data privatization, and (4) data reordering to reduce computation and allow vectorization. The new code achieves (1) weak-scaling parallel efficiency of 0.989 for 131,072 cores, and (2) eight-fold reduction of time-to-solution (T2S) compared with the original code, on an Intel Knights Landing-based computer. The reduced T2S has for the first time allowed purely computational synthesis of atomically-thin transition metal dichalcogenide layers assisted by machine learning to discover a novel synthetic pathway.
KW - Computational materials science and engineering
KW - Hybrid/heterogeneous/accelerated algorithms and other high-performance algorithms
UR - https://www.scopus.com/pages/publications/85063103328
U2 - 10.1109/ScalA.2018.00009
DO - 10.1109/ScalA.2018.00009
M3 - Conference contribution
AN - SCOPUS:85063103328
T3 - Proceedings of ScalA 2018: 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 41
EP - 48
BT - Proceedings of ScalA 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2018
Y2 - 12 November 2018
ER -