Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing

  • Chenghao Lyu
  • Qi Fan
  • Fei Song
  • Arnab Sinha
  • Yanlei Diao
  • Wei Chen
  • Li Ma
  • Yihui Feng
  • Yaliang Li
  • Kai Zeng
  • Jingren Zhou

Research output: Contribution to journal › Conference article › peer-review

Abstract

Big data processing at production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting the performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, placement of parallel instances on machines, and resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the scale and complexity of big data systems while having to meet stringent time constraints for scheduling. This paper presents a MaxCompute-based integrated system that supports multi-objective resource optimization via fine-grained instance-level modeling and optimization. We propose a new architecture that breaks RO into a series of simpler problems, new fine-grained predictive models, and novel optimization methods that exploit these models to make effective instance-level RO decisions well under a second. Evaluation using production workloads shows that, compared to the current optimizer and scheduler, our new RO system can simultaneously reduce latency by 37-72% and cost by 43-78%, while running in 0.02-0.23s.
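At the heart of the MOO step described above is choosing among candidate configurations that trade off latency against cost. As a minimal illustration only (not the paper's actual algorithm, whose models and solvers are far more elaborate), a Pareto-dominance filter over hypothetical (latency, cost) candidate pairs could look like:

```python
def pareto_front(points):
    """Return the Pareto-optimal subset of (latency, cost) pairs,
    where lower is better on both objectives."""
    front = []
    for p in points:
        # p is dominated if some other point q is no worse on both
        # objectives and strictly better on at least one.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical candidates: (latency in seconds, cost in dollars)
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 6.0), (11.0, 6.5)]
print(pareto_front(candidates))
# (11.0, 6.5) is dropped: (9.0, 6.0) beats it on both objectives.
```

The remaining non-dominated points form the latency/cost trade-off curve from which a final configuration can be picked according to user preferences.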

Original language: English
Pages (from-to): 3098-3111
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 15
Issue number: 11
DOIs
Publication status: Published - 1 Jan 2022
Externally published: Yes
Event: 48th International Conference on Very Large Data Bases, VLDB 2022 - Sydney, Australia
Duration: 5 Sept 2022 - 9 Sept 2022

