Dynamic Adjustment of Reward Function for Proximal Policy Optimization with Imitation Learning: Application to Automated Parking Systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Automated Parking Systems (APS) are responsible for performing a parking maneuver securely and time-efficiently in full autonomy. These systems mainly comprise three methods: parking spot exploration, path planning, and path tracking. In the literature, there are several path planning and tracking methods in which the application of reinforcement learning is widespread. However, performance tuning and ensuring efficiency remain a significant open problem. Moreover, these methods suffer from the non-linearity of vehicle dynamics, which causes a deviation from the original route, and do not respect the BS ISO 16787-2017 standard, which outlines the minimum requirements for APS. To overcome these limitations, our contribution in this paper, named DPPO-IL, is fourfold: (i) a new framework using the Proximal Policy Optimization algorithm, allowing the agent to explore for an empty parking spot, plan, and then park a car in a random parking spot while avoiding static and dynamic obstacles; (ii) a dynamic adjustment of the reward function using intrinsic reward signals to induce the agent to explore more; (iii) an approach to learn policies from expert demonstrations using imitation learning combined with deep reinforcement learning to speed up the learning phase and reduce the training time; (iv) a task-specific curriculum learning scheme to train the agent in a very complex environment. Experiments show promising results: our approach achieved a 90% success rate, of which 97% were aligned with the parking spot, with an inclination angle greater than ±0.2° and a deviation less than 0.1 meter. These results exceeded the state of the art while respecting the ISO 16787-2017 standard.

Original language: English
Title of host publication: 2022 IEEE Intelligent Vehicles Symposium, IV 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1400-1408
Number of pages: 9
ISBN (Electronic): 9781665488211
DOIs
Publication status: Published - 1 Jan 2022
Event: 2022 IEEE Intelligent Vehicles Symposium, IV 2022 - Aachen, Germany
Duration: 5 Jun 2022 - 9 Jun 2022

Publication series

Name: IEEE Intelligent Vehicles Symposium, Proceedings
Volume: 2022-June

Conference

Conference: 2022 IEEE Intelligent Vehicles Symposium, IV 2022
Country/Territory: Germany
City: Aachen
Period: 5/06/22 - 9/06/22
