TY - GEN
T1 - Reuse-based optimization for pig Latin
AU - Camacho-Rodríguez, Jesús
AU - Colazzo, Dario
AU - Herschel, Melanie
AU - Manolescu, Ioana
AU - Chowdhury, Soudip Roy
N1 - Publisher Copyright:
© 2016 ACM.
PY - 2016/10/24
Y1 - 2016/10/24
N2 - Pig Latin is a popular language which is widely used for parallel processing of massive data sets. Currently, subexpressions occurring repeatedly in Pig Latin scripts are executed as many times as they appear, and the current Pig Latin optimizer does not identify reuse opportunities. We present a novel optimization approach aiming at identifying and reusing repeated subexpressions in Pig Latin scripts. Our optimization algorithm, named PigReuse, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and reuses their results as needed in order to compute exactly the same output as the original scripts. Our experiments demonstrate the effectiveness of our approach.
AB - Pig Latin is a popular language which is widely used for parallel processing of massive data sets. Currently, subexpressions occurring repeatedly in Pig Latin scripts are executed as many times as they appear, and the current Pig Latin optimizer does not identify reuse opportunities. We present a novel optimization approach aiming at identifying and reusing repeated subexpressions in Pig Latin scripts. Our optimization algorithm, named PigReuse, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and reuses their results as needed in order to compute exactly the same output as the original scripts. Our experiments demonstrate the effectiveness of our approach.
KW - Linear programming
KW - PigLatin
KW - Reuse-based optimization
UR - https://www.scopus.com/pages/publications/84996593452
U2 - 10.1145/2983323.2983669
DO - 10.1145/2983323.2983669
M3 - Conference contribution
AN - SCOPUS:84996593452
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2215
EP - 2220
BT - CIKM 2016 - Proceedings of the 2016 ACM Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 25th ACM International Conference on Information and Knowledge Management, CIKM 2016
Y2 - 24 October 2016 through 28 October 2016
ER -