TY - GEN
T1 - Anytime Large-Scale Analytics of Linked Open Data
AU - Soulet, Arnaud
AU - Suchanek, Fabian M.
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019/1/1
Y1 - 2019/1/1
AB - Analytical queries are queries with numerical aggregators: computing the average number of objects per property, identifying the most frequent subjects, etc. Such queries are essential to monitor the quality and the content of the Linked Open Data (LOD) cloud. Many analytical queries cannot be executed directly on the SPARQL endpoints, because the fair use policy cuts off expensive queries. In this paper, we show how to rewrite such queries into a set of queries that each satisfy the fair use policy. We then show how to execute these queries in such a way that the result provably converges to the exact query answer. Our algorithm is an anytime algorithm, meaning that it can give intermediate approximate results at any time point. Our experiments show that the approach converges rapidly towards the exact solution, and that it can compute even complex indicators at the scale of the LOD cloud.
DO - 10.1007/978-3-030-30793-6_33
M3 - Conference contribution
AN - SCOPUS:85075748702
SN - 9783030307929
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 576
EP - 592
BT - The Semantic Web – ISWC 2019 - 18th International Semantic Web Conference, Proceedings
A2 - Ghidini, Chiara
A2 - Hartig, Olaf
A2 - Maleshkova, Maria
A2 - Svátek, Vojtěch
A2 - Cruz, Isabel
A2 - Hogan, Aidan
A2 - Song, Jie
A2 - Lefrançois, Maxime
A2 - Gandon, Fabien
PB - Springer
T2 - 18th International Semantic Web Conference, ISWC 2019
Y2 - 26 October 2019 through 30 October 2019
ER -