TY - GEN
T1 - A study of the scalability of stop-the-world garbage collectors on multicores
AU - Gidra, Lokesh
AU - Thomas, Gaël
AU - Sopena, Julien
AU - Shapiro, Marc
PY - 2013/3/16
Y1 - 2013/3/16
N2 - Large-scale multicore architectures create new challenges for garbage collectors (GCs). In particular, throughput-oriented stop-the-world algorithms demonstrate good performance with a small number of cores, but have been shown to degrade badly beyond approximately 8 cores on a 48-core machine with OpenJDK 7. This negative result raises the question of whether the stop-the-world design has intrinsic limitations that would require a radically different approach. Our study suggests that the answer is no, and that there is no compelling scalability reason to discard the existing highly-optimised throughput-oriented GC code on contemporary hardware. This paper studies the default throughput-oriented garbage collector of OpenJDK 7, called Parallel Scavenge. We identify its bottlenecks and show how to eliminate them using well-established parallel programming techniques. On the SPECjbb2005, SPECjvm2008 and DaCapo 9.12 benchmarks, the improved GC matches the performance of Parallel Scavenge at low core counts, but scales well, up to 48 cores.
AB - Large-scale multicore architectures create new challenges for garbage collectors (GCs). In particular, throughput-oriented stop-the-world algorithms demonstrate good performance with a small number of cores, but have been shown to degrade badly beyond approximately 8 cores on a 48-core machine with OpenJDK 7. This negative result raises the question of whether the stop-the-world design has intrinsic limitations that would require a radically different approach. Our study suggests that the answer is no, and that there is no compelling scalability reason to discard the existing highly-optimised throughput-oriented GC code on contemporary hardware. This paper studies the default throughput-oriented garbage collector of OpenJDK 7, called Parallel Scavenge. We identify its bottlenecks and show how to eliminate them using well-established parallel programming techniques. On the SPECjbb2005, SPECjvm2008 and DaCapo 9.12 benchmarks, the improved GC matches the performance of Parallel Scavenge at low core counts, but scales well, up to 48 cores.
KW - Garbage collection
KW - Multicore
KW - NUMA
U2 - 10.1145/2451116.2451142
DO - 10.1145/2451116.2451142
M3 - Conference contribution
AN - SCOPUS:84875677989
SN - 9781450318709
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 229
EP - 239
BT - ASPLOS 2013 - 18th International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
T2 - 18th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2013
Y2 - 16 March 2013 through 20 March 2013
ER -