TY - GEN
T1 - Ranked enumeration of MSO Logic on Words
AU - Bourhis, Pierre
AU - Grez, Alejandro
AU - Jachiet, Louis
AU - Riveros, Cristian
N1 - Publisher Copyright: © Pierre Bourhis, Alejandro Grez, Louis Jachiet, and Cristian Riveros.
PY - 2021/3/1
Y1 - 2021/3/1
AB - In recent years, enumeration algorithms with bounded delay have attracted a lot of attention for several data management tasks. Given a query and the data, the task is to preprocess the data and then enumerate all the answers to the query one by one and without repetitions. This enumeration scheme is typically useful when the solutions are processed on the fly or when we want to stop the enumeration once the pertinent solutions have been found. However, with the current schemes, there is no restriction on the order in which the solutions are given, and this order usually depends on the techniques used and not on the relevance for the user. In this paper we study the enumeration of monadic second-order logic (MSO) over words when the solutions are ranked. We present a framework based on MSO cost functions that allows expressing MSO formulae on words with a cost associated with each solution. We then demonstrate the generality of our framework, which subsumes, for instance, document spanners and adds ranking to them. The main technical result of the paper is an algorithm for efficiently enumerating all the solutions of formulae in increasing order of cost, namely, with a linear preprocessing phase and logarithmic delay between solutions. The novelty of this algorithm lies in its use of functional data structures, in particular, in extending functional Brodal queues to suit the ranked enumeration of MSO on words.
KW - Enumeration algorithms
KW - Persistent data structures
KW - Query evaluation
U2 - 10.4230/LIPIcs.ICDT.2021.20
DO - 10.4230/LIPIcs.ICDT.2021.20
M3 - Conference contribution
AN - SCOPUS:85106668500
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 24th International Conference on Database Theory, ICDT 2021
A2 - Yi, Ke
A2 - Wei, Zhewei
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 24th International Conference on Database Theory, ICDT 2021
Y2 - 23 March 2021 through 26 March 2021
ER -