TY - GEN
T1 - Progressive State Space Disaggregation for Infinite Horizon Dynamic Programming
AU - Forghieri, Orso
AU - Castel, Hind
AU - Hyon, Emmanuel
AU - Le Pennec, Erwan
N1 - Publisher Copyright:
Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2024/5/30
Y1 - 2024/5/30
N2 - High dimensionality of model-based Reinforcement Learning and Markov Decision Processes can be reduced using abstractions of the state and action spaces. Although hierarchical learning and state abstraction methods have been explored over the past decades, explicit methods to build useful abstractions of models are rarely provided. In this work, we provide a new state abstraction method for solving infinite horizon problems in the discounted and total settings. Our approach is to progressively disaggregate abstract regions by iteratively slicing aggregations of states relatively to a value function. The distinguishing feature of our method, in contrast to previous approximations of the Bellman operator, is the disaggregation of regions during value function iterations (or policy evaluation steps). The objective is to find a more efficient aggregation that reduces the error on each piece of the partition. We provide a proof of convergence for this algorithm without making any assumptions about the structure of the problem. We also show that this process decreases the computational complexity of the Bellman operator iteration and provides useful abstractions. We then plug this state space disaggregation process in classical Dynamic Programming algorithms, namely Approximate Value Iteration, Q-Value Iteration and Policy Iteration. Finally, we conduct a numerical comparison, which shows that our algorithm is faster than both traditional dynamic programming approach and recent aggregative methods that use a fixed number of partitions.
AB - High dimensionality of model-based Reinforcement Learning and Markov Decision Processes can be reduced using abstractions of the state and action spaces. Although hierarchical learning and state abstraction methods have been explored over the past decades, explicit methods to build useful abstractions of models are rarely provided. In this work, we provide a new state abstraction method for solving infinite horizon problems in the discounted and total settings. Our approach is to progressively disaggregate abstract regions by iteratively slicing aggregations of states relatively to a value function. The distinguishing feature of our method, in contrast to previous approximations of the Bellman operator, is the disaggregation of regions during value function iterations (or policy evaluation steps). The objective is to find a more efficient aggregation that reduces the error on each piece of the partition. We provide a proof of convergence for this algorithm without making any assumptions about the structure of the problem. We also show that this process decreases the computational complexity of the Bellman operator iteration and provides useful abstractions. We then plug this state space disaggregation process in classical Dynamic Programming algorithms, namely Approximate Value Iteration, Q-Value Iteration and Policy Iteration. Finally, we conduct a numerical comparison, which shows that our algorithm is faster than both traditional dynamic programming approach and recent aggregative methods that use a fixed number of partitions.
UR - https://www.scopus.com/pages/publications/85195929792
U2 - 10.1609/icaps.v34i1.31479
DO - 10.1609/icaps.v34i1.31479
M3 - Conference contribution
AN - SCOPUS:85195929792
T3 - Proceedings International Conference on Automated Planning and Scheduling, ICAPS
SP - 221
EP - 229
BT - Proceedings of the 34th International Conference on Automated Planning and Scheduling, ICAPS 2024
A2 - Bernardini, Sara
A2 - Muise, Christian
PB - Association for the Advancement of Artificial Intelligence
T2 - 34th International Conference on Automated Planning and Scheduling, ICAPS 2024
Y2 - 1 June 2024 through 6 June 2024
ER -