TY - JOUR
T1 - Measuring and interpreting performances of HPC applications with dependent tasks
AU - Pereira, Romain
AU - Gautier, Thierry
AU - Roussel, Adrien
AU - Carribault, Patrick
N1 - Publisher Copyright: © 2025
PY - 2026/1/1
Y1 - 2026/1/1
N2 - Breaking down the parallel time into work, idleness, and overheads is crucial for assessing the performance of HPC applications, but it is challenging to measure in runtime systems with dependent tasks. No existing tool measures this breakdown accurately. This paper introduces POT, a tool suite for performance analysis of parallel applications with support for dependent tasks. We focus on its low-disturbance methodology, which consists of parallel object modeling, discrete-event tracing, and post-mortem simulation-based analysis. The POT tool suite supports tracing and analysis of OMPT (OpenMP), PMPI (MPI), and pthreads events. The paper evaluates the accuracy of POT's analysis on the LLVM and MPC-OMP implementations. It shows that measurement bias may be neglected above 16 μs of workload per task, portably across two architectures and OpenMP runtime systems. We also illustrate the benefits of POT's post-mortem simulation approach for analyzing mixed programming models with MPI+OpenMP.
AB - Breaking down the parallel time into work, idleness, and overheads is crucial for assessing the performance of HPC applications, but it is challenging to measure in runtime systems with dependent tasks. No existing tool measures this breakdown accurately. This paper introduces POT, a tool suite for performance analysis of parallel applications with support for dependent tasks. We focus on its low-disturbance methodology, which consists of parallel object modeling, discrete-event tracing, and post-mortem simulation-based analysis. The POT tool suite supports tracing and analysis of OMPT (OpenMP), PMPI (MPI), and pthreads events. The paper evaluates the accuracy of POT's analysis on the LLVM and MPC-OMP implementations. It shows that measurement bias may be neglected above 16 μs of workload per task, portably across two architectures and OpenMP runtime systems. We also illustrate the benefits of POT's post-mortem simulation approach for analyzing mixed programming models with MPI+OpenMP.
KW - High performance computing
KW - MPI
KW - OpenMP
KW - Performance tool
KW - Tasks
UR - https://www.scopus.com/pages/publications/105008320133
U2 - 10.1016/j.future.2025.107933
DO - 10.1016/j.future.2025.107933
M3 - Article
AN - SCOPUS:105008320133
SN - 0167-739X
VL - 174
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
M1 - 107933
ER -