
Dynamic Adjustment of Reward Function for Proximal Policy Optimization with Imitation Learning: Application to Automated Parking Systems

Research output: Chapter in a book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

Automated Parking Systems (APS) are responsible for performing a parking maneuver securely, time-efficiently, and fully autonomously. These systems mainly comprise three components: parking spot exploration, path planning, and path tracking. The literature offers several path planning and tracking methods, among which reinforcement learning is widely applied. However, tuning their performance and ensuring their efficiency remain significant open problems. Moreover, these methods suffer from the non-linearity of vehicle dynamics, which causes deviation from the planned route, and they do not respect the BS ISO 16787-2017 standard, which outlines the minimum requirements for APS. To overcome these limitations, our contribution in this paper, named DPPO-IL, is fourfold: (i) a new framework based on the Proximal Policy Optimization algorithm that allows the agent to search for an empty parking spot, plan a path, and then park the car in a random parking spot while avoiding static and dynamic obstacles; (ii) a dynamic adjustment of the reward function using intrinsic reward signals to induce the agent to explore more; (iii) an approach that learns policies from expert demonstrations by combining imitation learning with deep reinforcement learning, to speed up the learning phase and reduce training time; (iv) a task-specific curriculum learning scheme to train the agent in a very complex environment. Experiments show promising results: our approach achieved a 90% success rate, and 97% of the successful maneuvers were aligned with the parking spot, with an inclination angle within ±0.2° and a deviation of less than 0.1 meter. These results exceed the state of the art while respecting the ISO 16787-2017 standard.

Original language: English
Title of host publication: 2022 IEEE Intelligent Vehicles Symposium, IV 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1400-1408
Number of pages: 9
ISBN (Electronic): 9781665488211
DOIs
Publication status: Published - 1 Jan 2022
Event: 2022 IEEE Intelligent Vehicles Symposium, IV 2022 - Aachen, Germany
Duration: 5 Jun 2022 to 9 Jun 2022

Publication series

Name: IEEE Intelligent Vehicles Symposium, Proceedings
Volume: 2022-June

Conference

Conference: 2022 IEEE Intelligent Vehicles Symposium, IV 2022
Country/Territory: Germany
City: Aachen
Period: 5/06/22 to 9/06/22
