Verifying the Steps of Deductive Reasoning Chains

Research output: Chapter in book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

As Large Language Models penetrate everyday life more and more, it becomes essential to measure the correctness of their output. In this paper, we propose a novel task: the automatic verification of individual reasoning steps in a logical deductive Chain-of-Thought. This task addresses two well-known problems of LLMs, hallucination and incorrect reasoning. We propose a new dataset of logical reasoning chains, in which the individual deduction steps have been manually annotated for soundness, and benchmark several methods on it. We find that LLMs can detect unsound reasoning steps fairly well, but argue that verification has to be performed by transparent methods instead. We test symbolic methods, but find that they under-perform. We develop a neuro-symbolic baseline called VANESSA that comes closer to the performance of LLMs.

Original language: English
Title: Findings of the Association for Computational Linguistics
Subtitle: ACL 2025
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 456-475
Number of pages: 20
ISBN (Electronic): 9798891762565
Status: Published - 1 Jan 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
