Joint Monitorless Load-Balancing and Autoscaling for Zero-Wait-Time in Data Centers

Yoann Desmouceaux, Marcel Enguehard, Thomas H. Clausen

Research output: Contribution to journal › Article › peer-review

Abstract

Cloud architectures achieve scaling through two main functions: (i) load-balancers, which dispatch queries among replicated virtualized application instances, and (ii) autoscalers, which automatically adjust the number of replicated instances to accommodate variations in load patterns. These functions are often provided through centralized load monitoring, incurring operational complexity. This article introduces a unified and centralized-monitoring-free architecture achieving both autoscaling and load-balancing, reducing operational overhead while improving response-time performance. Application instances are virtually ordered in a chain, and new queries are forwarded along this chain until an instance, based on its local load, accepts the query. Autoscaling is triggered by the last application instance, which inspects its average load and infers whether its chain is under- or over-provisioned. An analytical model of the system is derived, and proves that the proposed technique can achieve asymptotic zero-wait time with high (and controllable) probability. This result is confirmed by extensive simulations, which highlight close-to-ideal performance in terms of both response time and resource costs.
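
The chain mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Instance` class, the capacity-based acceptance rule, the rule that the last instance always takes a query that nobody else accepted, and the `low`/`high` thresholds are all assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    capacity: int   # hypothetical: number of queries served without waiting
    busy: int = 0   # queries currently in service on this instance

    def accepts(self) -> bool:
        # Purely local decision: no central monitor is consulted.
        return self.busy < self.capacity


def dispatch(chain: List[Instance]) -> int:
    """Forward one query along the chain; return the index that took it."""
    for i, inst in enumerate(chain[:-1]):
        if inst.accepts():
            inst.busy += 1
            return i
    # Assumption: the last instance takes whatever reaches it.
    chain[-1].busy += 1
    return len(chain) - 1


def autoscale(chain: List[Instance], last_avg_load: float,
              low: float = 0.2, high: float = 0.8) -> int:
    """Scaling decision taken from the last instance's average load only."""
    if last_avg_load > high:
        # Chain looks under-provisioned: append one more instance.
        chain.append(Instance(capacity=chain[-1].capacity))
    elif last_avg_load < low and len(chain) > 1:
        # Chain looks over-provisioned: drop the last instance.
        chain.pop()
    return len(chain)
```

In this sketch, a busy last instance suggests that queries routinely traverse the whole chain, so the chain grows; an idle last instance suggests spare capacity, so the chain shrinks.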

Original language: English
Article number: 9295352
Pages (from-to): 672-686
Number of pages: 15
Journal: IEEE Transactions on Network and Service Management
Volume: 18
Issue number: 1
DOIs
Publication status: Published - 1 Mar 2021

Keywords

  • Load balancing
  • application-aware
  • auto-scaling
  • performance analysis
  • segment routing
