ON BITS AND BANDITS: QUANTIFYING THE REGRET-INFORMATION TRADE-OFF

  • Itai Shufaro
  • Nadav Merlis
  • Nir Weinberger
  • Shie Mannor

Research output: Chapter in a book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

In many sequential decision problems, an agent performs a repeated task. It then suffers regret and obtains information that it may use in the following rounds. However, sometimes the agent may also obtain information and avoid suffering regret by querying external sources. We study the trade-off between the information an agent accumulates and the regret it suffers. We invoke information-theoretic methods for obtaining regret lower bounds, which also allow us to easily re-derive several known lower bounds. We introduce the first Bayesian regret lower bounds that depend on the information an agent accumulates. We also prove regret upper bounds using the amount of information the agent accumulates. These bounds show that information, measured in bits, can be traded off for regret, measured in reward. Finally, we demonstrate the utility of these bounds in improving the performance of a question-answering task with large language models, allowing us to obtain valuable insights.

Original language: English
Title: 13th International Conference on Learning Representations, ICLR 2025
Publisher: International Conference on Learning Representations, ICLR
Pages: 987-1011
Number of pages: 25
ISBN (Electronic): 9798331320850
Status: Published - 1 Jan 2025
Externally published: Yes
Event: 13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: 24 Apr 2025 to 28 Apr 2025

Publication series

Name: 13th International Conference on Learning Representations, ICLR 2025

Conference

Conference: 13th International Conference on Learning Representations, ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 24/04/25 to 28/04/25
