Reuse-based optimization for Pig Latin

  • Hortonworks
  • Université Paris Dauphine
  • University of Stuttgart
  • Fractal Analytics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Pig Latin is a popular language which is widely used for parallel processing of massive data sets. Currently, subexpressions occurring repeatedly in Pig Latin scripts are executed as many times as they appear, and the current Pig Latin optimizer does not identify reuse opportunities. We present a novel optimization approach aiming at identifying and reusing repeated subexpressions in Pig Latin scripts. Our optimization algorithm, named PigReuse, identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and reuses their results as needed in order to compute exactly the same output as the original scripts. Our experiments demonstrate the effectiveness of our approach.
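The general idea of finding repeated subexpressions and ranking them by estimated benefit can be illustrated with a minimal Python sketch. This is not the PigReuse algorithm: the tuple-based operator representation, the function names, the unit cost per operator, and the greedy choice of a single best candidate are all assumptions made for illustration. The paper's keywords suggest its actual selection step relies on linear programming, which this toy example does not attempt.

```python
from collections import defaultdict

# Hypothetical toy representation (an assumption, not Pig's internal plan format):
# each operator is a tuple (operator_name, argument, child_1, child_2, ...).

def collect(expr, counts):
    """Count every subexpression; equal tuples hash alike, so equal subtrees share a key."""
    counts[expr] += 1
    for child in expr[2:]:
        collect(child, counts)

def repeated_subexpressions(scripts):
    """Return the subexpressions that occur more than once across all scripts."""
    counts = defaultdict(int)
    for root in scripts:
        collect(root, counts)
    return {expr: n for expr, n in counts.items() if n > 1}

def cost(expr):
    """Assumed cost model: one unit per operator in the subtree."""
    return 1 + sum(cost(child) for child in expr[2:])

# Two toy scripts that share the same LOAD followed by the same FILTER.
load = ("LOAD", "visits.csv")
filt = ("FILTER", "year == 2016", load)
script_a = ("GROUP", "user", filt)
script_b = ("DISTINCT", None, filt)

shared = repeated_subexpressions([script_a, script_b])

# Greedy stand-in for the selection step: prefer the candidate that
# saves the most recomputation, (occurrences - 1) * cost.
best = max(shared, key=lambda e: (shared[e] - 1) * cost(e))
```

In this toy input both the load and the filter occur twice, and the filter subtree is preferred because reusing it also avoids repeating the load beneath it.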

Original language: English
Title of host publication: CIKM 2016 - Proceedings of the 2016 ACM Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 2215-2220
Number of pages: 6
ISBN (Electronic): 9781450340731
Publication status: Published - 24 Oct 2016
Event: 25th ACM International Conference on Information and Knowledge Management, CIKM 2016 - Indianapolis, United States
Duration: 24 Oct 2016 to 28 Oct 2016

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
Volume: 24-28-October-2016

Conference

Conference: 25th ACM International Conference on Information and Knowledge Management, CIKM 2016
Country/Territory: United States
City: Indianapolis
Period: 24/10/16 to 28/10/16

Keywords

  • Linear programming
  • PigLatin
  • Reuse-based optimization
