Enabling Markovian Representations under Imperfect Information

Research output: Contribution to journal › Conference article › peer-review

Abstract

Markovian systems are widely used in reinforcement learning (RL) when the successful completion of a task depends exclusively on the last interaction between an autonomous agent and its environment. Unfortunately, real-world instructions are typically complex and often better described as non-Markovian. In this paper we present an extension method that allows solving partially-observable non-Markovian reward decision processes (PONMRDPs) by solving equivalent Markovian models. This potentially enables state-of-the-art Markovian techniques, including RL, to find optimal behaviours for problems best described as PONMRDPs. We provide formal optimality guarantees for our extension method, together with a counterexample illustrating that naive extensions of existing techniques for fully-observable environments cannot provide such guarantees.
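The abstract does not spell out the construction, but for intuition: in the fully-observable case, a standard way to make a non-Markovian reward Markovian is to extend the state space with the state of a finite automaton that tracks the reward-relevant history (as in product-MDP or reward-machine constructions). The Python sketch below illustrates that general idea only; all names (MarkovianExtension, step_fn, reward_fn) are hypothetical and not taken from the paper, and the paper's counterexample indicates that such naive extensions do not carry optimality guarantees over to the partially-observable setting.

    # Hypothetical sketch (not the paper's construction): making a
    # non-Markovian reward Markovian by extending states with the state of a
    # finite automaton that summarises the reward-relevant history.
    from dataclasses import dataclass
    from typing import Callable, Hashable, Tuple

    @dataclass(frozen=True)
    class ExtendedState:
        observation: Hashable  # the agent's (possibly partial) observation
        memory: Hashable       # finite automaton state summarising history

    class MarkovianExtension:
        """Wraps a reward that depends on the interaction history so that
        it depends only on the current extended state."""

        def __init__(self, automaton_step: Callable, reward: Callable,
                     initial_memory: Hashable):
            self.automaton_step = automaton_step  # (memory, obs) -> memory'
            self.reward = reward                  # (memory', obs) -> float
            self.memory = initial_memory

        def step(self, observation) -> Tuple[ExtendedState, float]:
            # Advance the memory automaton on the new observation, then read
            # a Markovian reward from the extended state, not the history.
            self.memory = self.automaton_step(self.memory, observation)
            return (ExtendedState(observation, self.memory),
                    self.reward(self.memory, observation))

    # Example: reward 1 whenever "B" is observed after "A" has been
    # observed, a task that is non-Markovian over raw observations alone.
    def step_fn(memory, obs):
        if memory == "start" and obs == "A":
            return "seen_A"
        if memory == "seen_A" and obs == "B":
            return "done"
        return memory

    def reward_fn(memory, obs):
        return 1.0 if memory == "done" and obs == "B" else 0.0

    ext = MarkovianExtension(step_fn, reward_fn, "start")
    for obs in ["B", "A", "B"]:
        state, r = ext.step(obs)  # rewards: 0.0, 0.0, 1.0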

Original language: English
Pages (from-to): 450-457
Number of pages: 8
Journal: International Conference on Agents and Artificial Intelligence
Volume: 2
Publication status: Published - 1 Jan 2022
Event: 14th International Conference on Agents and Artificial Intelligence, ICAART 2022 - Virtual, Online
Duration: 3 Feb 2022 - 5 Feb 2022

Keywords

  • Extended Partially Observable Decision Process
  • Markov Decision Processes
  • Partial Observability
  • Non-Markovian Rewards
