Web data indexing in the cloud: Efficiency and cost reductions

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

An increasing share of the world's data is either shared through the Web or directly produced through and for Web platforms, in particular using structured formats such as XML or JSON. Cloud platforms are attractive candidates for handling large data repositories because of their elastic scaling properties. Popular commercial clouds provide a variety of sub-systems and primitives for storing data in specific formats (files, key-value pairs, etc.), as well as dedicated sub-systems for running and coordinating execution within the cloud. We propose an architecture for warehousing large-scale Web data, in particular XML, in a commercial cloud platform, specifically Amazon Web Services. Since cloud users bear monetary costs directly tied to their consumption of cloud resources, we focus on indexing content in the cloud. We study the applicability of several indexing strategies and show that they not only reduce query evaluation time but also, importantly, reduce the monetary costs of operating the cloud-based warehouse. Our architecture can be easily adapted to similar cloud-based complex data warehousing settings, carrying over the benefits of access path selection in the cloud.

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2013
Subtitle of host publication: 16th International Conference on Extending Database Technology, Proceedings
Pages: 41-52
Number of pages: 12
Publication status: Published - 2 May 2013
Externally published: Yes
Event: 16th International Conference on Extending Database Technology, EDBT 2013 - Genoa, Italy
Duration: 18 Mar 2013 to 22 Mar 2013

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 16th International Conference on Extending Database Technology, EDBT 2013
Country/Territory: Italy
City: Genoa
Period: 18/03/13 to 22/03/13

Keywords

  • cloud computing
  • monetary cost
  • query processing
  • web data management
