TY - GEN
T1 - High performance code generation for stencil computation on heterogeneous multi-device architectures
AU - Li, Pei
AU - Brunet, Elisabeth
AU - Namyst, Raymond
PY - 2014/1/1
Y1 - 2014/1/1
N2 - Heterogeneous archioectures have been widely used in the domain of high performance computing. On one hand, it allows a designer to use multiple types of computing units and each able to execute the tasks that it is best suited for to increase performance, on the other hand, it brings many challenges in programming for novice users, especially for heterogeneous systems with multi-devices. In this paper, we propose the code generator STEPOCL that generates OpenCL host program for heterogeneous multi-device architecture. In order to simplify the analyzing process, we ask user to provide the description of input and kernel parameters in an XML file, then our generator analyzes the description and generates automatically the host program. Due to the data partition and data exchange strategies, the generated host program can be executed on multi-devices without changing any kernel code. The experiment of iterative stencil loop code (ISL) shows that our tool is efficient. It guarantees the minimum data exchanges and achieves high performance on heterogeneous multi-device architecture.
AB - Heterogeneous archioectures have been widely used in the domain of high performance computing. On one hand, it allows a designer to use multiple types of computing units and each able to execute the tasks that it is best suited for to increase performance, on the other hand, it brings many challenges in programming for novice users, especially for heterogeneous systems with multi-devices. In this paper, we propose the code generator STEPOCL that generates OpenCL host program for heterogeneous multi-device architecture. In order to simplify the analyzing process, we ask user to provide the description of input and kernel parameters in an XML file, then our generator analyzes the description and generates automatically the host program. Due to the data partition and data exchange strategies, the generated host program can be executed on multi-devices without changing any kernel code. The experiment of iterative stencil loop code (ISL) shows that our tool is efficient. It guarantees the minimum data exchanges and achieves high performance on heterogeneous multi-device architecture.
KW - Code generation
KW - GPGPUs
KW - Heterogeneous architectures
KW - Multi-device
KW - OpenCL
KW - Stencil computations
U2 - 10.1109/HPCC.and.EUC.2013.213
DO - 10.1109/HPCC.and.EUC.2013.213
M3 - Conference contribution
AN - SCOPUS:84904022726
SN - 9780769550886
T3 - Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
SP - 1512
EP - 1518
BT - Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
PB - IEEE Computer Society
T2 - 15th IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 11th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, EUC 2013
Y2 - 13 November 2013 through 15 November 2013
ER -