CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies

  • Mohamed Alami Chehboune
  • Rim Kaddah
  • Luca Martino
  • Fernando Llorente
  • Jesse Read

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Reinforcement learning has drawn considerable interest as a tool for solving optimal control problems. Solving a given problem (task or environment) involves converging towards an optimal policy. However, there may exist multiple optimal policies that differ dramatically in their behaviour; for example, some may be faster than others, but at the expense of greater risk. We consider and study a distribution of optimal policies. We design a curiosity-augmented Metropolis algorithm (CAMEO) that samples optimal policies which adopt diverse behaviours, since diversity implies greater coverage of the different possible optimal policies. In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems, even in the challenging case of environments that provide sparse rewards. We further show that the sampled policies present different risk profiles, which corresponds to interesting practical applications in interpretability and represents a first step towards learning the distribution of optimal policies itself.
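
As a rough illustration of the sampling idea in the abstract, the following is a minimal sketch of a plain random-walk Metropolis sampler over policy parameters. It is not the authors' CAMEO algorithm: the curiosity augmentation is omitted because the abstract does not specify it, and the names `toy_return`, `log_target`, `temperature`, and `step_size` are assumptions introduced only for this example. The toy return has two equally good optima, standing in for a task with several distinct optimal policies.

```python
import math
import random


def toy_return(theta):
    """Hypothetical stand-in for a policy's return: two equal optima at -2 and +2."""
    x = theta[0]
    return -min((x - 2.0) ** 2, (x + 2.0) ** 2)


def log_target(theta, temperature=0.5):
    """Assumed unnormalised log-density: higher return means higher probability."""
    return toy_return(theta) / temperature


def metropolis(log_p_fn, theta0, n_steps, step_size=1.5, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = list(theta0)
    log_p = log_p_fn(theta)
    samples, accepted = [], 0
    for _ in range(n_steps):
        # Propose a perturbation of the current parameters.
        proposal = [t + rng.gauss(0.0, step_size) for t in theta]
        log_p_new = log_p_fn(proposal)
        diff = log_p_new - log_p
        # Accept with probability min(1, p(proposal) / p(theta)).
        if diff >= 0 or rng.random() < math.exp(diff):
            theta, log_p = proposal, log_p_new
            accepted += 1
        samples.append(list(theta))
    return samples, accepted / n_steps


samples, acceptance_rate = metropolis(log_target, [0.0], n_steps=20000)
near_right = sum(1 for s in samples if s[0] > 1.0)
near_left = sum(1 for s in samples if s[0] < -1.0)
print(f"acceptance rate: {acceptance_rate:.2f}")
print(f"samples near +2: {near_right}, samples near -2: {near_left}")
```

In this toy setting the chain spends time around both optima, which is the behaviour a sampler over optimal policies would aim for; in an actual reinforcement learning task, `toy_return` would be replaced by an estimate of the policy's return from environment rollouts.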

Original language: English
Title of host publication: 30th European Signal Processing Conference, EUSIPCO 2022 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 1482-1486
Number of pages: 5
ISBN (Electronic): 9789082797091
Publication status: Published - 1 Jan 2022
Event: 30th European Signal Processing Conference, EUSIPCO 2022 - Belgrade, Serbia
Duration: 29 Aug 2022 to 2 Sept 2022

Publication series

Name: European Signal Processing Conference
Volume: 2022-August
ISSN (Print): 2219-5491

Conference

Conference: 30th European Signal Processing Conference, EUSIPCO 2022
Country/Territory: Serbia
City: Belgrade
Period: 29/08/22 to 2/09/22

Keywords

  • Curiosity model
  • MCMC
  • Metropolis
  • Reinforcement Learning
