Abstract
Markovian systems are widely used in reinforcement learning (RL) when the successful completion of a task depends exclusively on the last interaction between an autonomous agent and its environment. Unfortunately, real-world instructions are typically complex and often better described as non-Markovian. In this paper, we present an extension method that solves partially-observable non-Markovian reward decision processes (PONMRDPs) by solving equivalent Markovian models. This potentially allows state-of-the-art Markovian techniques, including RL, to find optimal behaviours for problems best described as PONMRDPs. We provide formal optimality guarantees for our extension method, together with a counterexample showing that naive extensions of existing techniques for fully-observable environments cannot provide such guarantees.
| Original language | English |
|---|---|
| Pages (from - to) | 450-457 |
| Number of pages | 8 |
| Journal | International Conference on Agents and Artificial Intelligence |
| Volume | 2 |
| DOIs | |
| Status | Published - 1 Jan 2022 |
| Event | 14th International Conference on Agents and Artificial Intelligence, ICAART 2022 - Virtual, Online. Duration: 3 Feb 2022 → 5 Feb 2022 |
Paper title: Enabling Markovian Representations under Imperfect Information