Path summaries and path partitioning in modern XML databases

Research output: Contribution to journalArticlepeer-review

Abstract

XML path summaries are compact structures representing all the simple parent-child paths of an XML document. Such paths have also been used in many works as a basis for partitioning the document's content in a persistent store, under the form of path indices or path tables. We revisit the notions of path summaries and path-driven storage model in the context of current-day XML databases. This context is characterized by complex queries, typically expressed in an XQuery subset, and by the presence of efficient encoding techniques such as structural node identifiers. We review a path summary's many uses for query optimization, and given them a common basis, namely relevant paths. We discuss summary-based tree pattern minimization and present some efficient summary-based minimization heuristics. We consider relevant path computation and provide a time- and memory-efficient computation algorithm. We combine the principle of path partitioning with the presence of structural identifiers in a simple path-partitioned storage model, which allows for selective data access and efficient query plans. This model improves the efficiency of twig query processing up to two orders of magnitude over the similar tag-partitioned indexing model. We have implemented the path-partitioned storage model and path summaries in the XQueC compressed database prototype [8]. We present an experimental evaluation of a path summary's practical feasibility and of tree pattern matching in a path-partitioned store.

Original languageEnglish
Pages (from-to)117-151
Number of pages35
JournalWorld Wide Web
Volume11
Issue number1
DOIs
Publication statusPublished - 1 Mar 2008
Externally publishedYes

Keywords

  • Path partitioning
  • Path summaries
  • XML databases

Fingerprint

Dive into the research topics of 'Path summaries and path partitioning in modern XML databases'. Together they form a unique fingerprint.

Cite this