TY - GEN
T1 - Relative Performance Projection on Arm Architectures
AU - Gavoille, Clément
AU - Taboada, Hugo
AU - Carribault, Patrick
AU - Dupros, Fabrice
AU - Goglin, Brice
AU - Jeannot, Emmanuel
N1 - Publisher Copyright: © 2022, Springer Nature Switzerland AG.
PY - 2022/1/1
Y1 - 2022/1/1
AB - With the advent of multi- and many-core processors and hardware accelerators, choosing a specific architecture to renew a supercomputer can become very tedious. This decision process should consider the current and future parallel application needs and the design of the target software stack. It should also consider the single-core behavior of the application as it is one of the performance limitations in today’s machines. In such a scheme, performance hints on the impact of some hardware and software stack modifications are mandatory to drive this choice. This paper proposes a workflow for performance projection based on execution on an actual processor and the application’s behavior. This projection evaluates the performance variation from an existing core of a processor to a hypothetical one to drive the design choice. For this purpose, we characterize the maximum sustainable performance of the target machine and analyze the application using the software stack of the target machine. To validate this approach, we apply it to three applications of the CORAL benchmark suite: LULESH, MiniFE, and Quicksilver, using a single core of two Arm-based architectures: Marvell ThunderX2 and Arm Neoverse N1. Finally, we follow this validation work with an example of design-space exploration around the SVE vector size, the choice of DDR4 and HBM2, and the software stack choice on A64FX on our applications with a pool of three source architectures: Arm Neoverse N1, Marvell ThunderX2, and Fujitsu A64FX.
KW - Arm architecture
KW - Design space exploration
KW - Performance Projection
KW - Roofline model
U2 - 10.1007/978-3-031-12597-3_6
DO - 10.1007/978-3-031-12597-3_6
M3 - Conference contribution
AN - SCOPUS:85135839457
SN - 9783031125966
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 85
EP - 99
BT - Euro-Par 2022
A2 - Cano, José
A2 - Trinder, Phil
PB - Springer Science and Business Media Deutschland GmbH
T2 - 28th International European Conference on Parallel and Distributed Computing, Euro-Par 2022
Y2 - 22 August 2022 through 26 August 2022
ER -