TY - GEN
T1 - Statistical Claim Checking
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
AU - Balalau, Oana
AU - Ebel, Simon
AU - Galizzi, Théo
AU - Manolescu, Ioana
AU - Massonnat, Quentin
AU - Deiana, Antoine
AU - Gautreau, Emilie
AU - Krempf, Antoine
AU - Pontillon, Thomas
AU - Roux, Gérald
AU - Yakin, Joanna
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - To strengthen public trust and counter disinformation, computational fact-checking, leveraging digital data sources, attracts interest from journalists and the computer science community. A particular class of interesting data sources is statistics, that is, numerical data compiled mostly by governments, administrations, and international organizations. Statistics typically are multidimensional datasets, where multiple dimensions characterize one value, and the dimensions may be organized in a hierarchy. We developed StatCheck, a fact-checking system specialized in French. The technical novelty of StatCheck is twofold: (i) we focus on multidimensional, complex-structure statistics, which have received little attention so far, despite their practical importance; and (ii) novel statistical claim extraction modules for French, an area where few resources exist. We will demonstrate our system on large statistical datasets (hundreds of millions of facts), including the complete INSEE (French) and Eurostat (European Union) datasets. More information about StatCheck is available online at: https://team.inria.fr/cedar/projects/statcheck/.
AB - To strengthen public trust and counter disinformation, computational fact-checking, leveraging digital data sources, attracts interest from journalists and the computer science community. A particular class of interesting data sources is statistics, that is, numerical data compiled mostly by governments, administrations, and international organizations. Statistics typically are multidimensional datasets, where multiple dimensions characterize one value, and the dimensions may be organized in a hierarchy. We developed StatCheck, a fact-checking system specialized in French. The technical novelty of StatCheck is twofold: (i) we focus on multidimensional, complex-structure statistics, which have received little attention so far, despite their practical importance; and (ii) novel statistical claim extraction modules for French, an area where few resources exist. We will demonstrate our system on large statistical datasets (hundreds of millions of facts), including the complete INSEE (French) and Eurostat (European Union) datasets. More information about StatCheck is available online at: https://team.inria.fr/cedar/projects/statcheck/.
KW - data warehouses
KW - fact-checking
KW - multidimensional data
KW - natural language processing
U2 - 10.1145/3511808.3557198
DO - 10.1145/3511808.3557198
M3 - Conference contribution
AN - SCOPUS:85140832423
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 4798
EP - 4802
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 17 October 2022 through 21 October 2022
ER -