TY - CONF
T1 - Reconfidencing LLMs from the Grouping Loss Perspective
AU - Chen, Lihu
AU - Perez-Lebel, Alexandre
AU - Suchanek, Fabian M.
AU - Varoquaux, Gaël
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024/1/1
Y1 - 2024/1/1
AB - Large Language Models (LLMs), such as GPT and LLaMA, are susceptible to generating hallucinated answers in a confident tone. While previous efforts to elicit and calibrate confidence scores have shown some success, they often overlook biases towards certain groups, such as specific nationalities. Existing calibration methods typically focus on average performance, failing to address this disparity. In our study, we demonstrate that the concept of grouping loss is an effective metric for understanding and correcting the heterogeneity in confidence levels. We introduce a novel evaluation dataset, derived from a knowledge base, specifically designed to assess the confidence scores of LLM responses across different groups. Our experimental results highlight significant variations in confidence, which are accurately captured by grouping loss. To tackle this issue, we propose a new method to calibrate the confidence scores of LLMs by considering different groups, a process we term reconfidencing. Our findings indicate that this approach effectively mitigates biases against minority groups, contributing to the development of fairer LLMs. The code is available at https://github.com/tigerchen52/reconfidencing_llms.
UR - https://www.scopus.com/pages/publications/85217271333
U2 - 10.18653/v1/2024.findings-emnlp.85
DO - 10.18653/v1/2024.findings-emnlp.85
M3 - Conference contribution
AN - SCOPUS:85217271333
T3 - Findings of the Association for Computational Linguistics: EMNLP 2024
SP - 1567
EP - 1581
BT - Findings of the Association for Computational Linguistics: EMNLP 2024
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the Association for Computational Linguistics: EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -