TY - GEN
T1 - Adaptive OpenMP for large NUMA nodes
AU - Mahéo, Aurèle
AU - Koliaï, Souad
AU - Carribault, Patrick
AU - Pérache, Marc
AU - Jalby, William
PY - 2012/6/18
Y1 - 2012/6/18
AB - The advent of multicore processors advocates for a hybrid programming model such as MPI+OpenMP. OpenMP runtimes must therefore perform well from a small number of threads (one MPI task per socket, OpenMP inside each socket) to a large number of threads (one MPI task per node, OpenMP inside each node). To tackle this issue, we propose a mechanism that improves the performance of thread synchronization across a wide range of thread counts. It relies on a hierarchical tree that is traversed differently depending on the number of threads inside the parallel region. Our approach delivers high performance for thread activation (parallel construct) and thread synchronization (barrier construct). Several papers study hierarchical structures to launch and synchronize OpenMP threads [1, 2]. They tested tree-based approaches to distribute and synchronize threads, but they did not explore mixed hierarchical solutions.
UR - https://www.scopus.com/pages/publications/84862180635
U2 - 10.1007/978-3-642-30961-8_20
DO - 10.1007/978-3-642-30961-8_20
M3 - Conference contribution
AN - SCOPUS:84862180635
SN - 9783642309601
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 254
EP - 257
BT - OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, IWOMP 2012, Proceedings
T2 - 8th International Workshop on OpenMP, IWOMP 2012
Y2 - 11 June 2012 through 13 June 2012
ER -