TY - GEN
T1 - Towards scalable hybrid stores
T2 - 2019 International Conference on Management of Data, SIGMOD 2019
AU - Alotaibi, Rana
AU - Bursztyn, Damian
AU - Deutsch, Alin
AU - Manolescu, Ioana
AU - Zampetakis, Stamatis
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/6/25
Y1 - 2019/6/25
N2 - Big data applications routinely involve diverse datasets: relations flat or nested, complex-structure graphs, documents, poorly structured logs, or even text data. To handle the data, application designers usually rely on several data stores used side-by-side, each capable of handling one or a few data models, and each very efficient for some, but not all, kinds of processing on the data. A current limitation is that applications are written taking into account which part of the data is stored in which store and how. This fails to take advantage of (i) possible redundancy, when the same data may be accessible (with different performance) from distinct data stores; (ii) partial query results (in the style of materialized views) which may be available in the stores. We present ESTOCADA, a novel approach connecting applications to the potentially heterogeneous systems where their input data resides. ESTOCADA can be used in a polystore setting to transparently enable each query to benefit from the best combination of stored data and available processing capabilities. ESTOCADA leverages recent advances in the area of view-based query rewriting under constraints, which we use to describe the various data models and stored data. Our experiments illustrate the significant performance gains achieved by ESTOCADA.
KW - Cross-model query rewriting
KW - Data models
KW - Integrity constraints
KW - Polystore systems
U2 - 10.1145/3299869.3319895
DO - 10.1145/3299869.3319895
M3 - Conference contribution
AN - SCOPUS:85069445725
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1660
EP - 1677
BT - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 30 June 2019 through 5 July 2019
ER -