
Leaky PPO: A Simple and Efficient RL Algorithm for Autonomous Vehicles

  • CNRS UMR 5157 SAMOVAR

Research output: Chapter in a book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

Interest in applying Reinforcement Learning (RL) to Autonomous Vehicles (AVs) is growing rapidly. Proximal Policy Optimization (PPO), a well-known RL algorithm with two versions, is simple to implement and highly general. In this paper, we first analyze the issues in each of the original PPO versions: an asymmetric penalty in the Adaptive KL Penalty Coefficient version, and gradient loss and a pessimistic estimate in the Clipped version. To address these issues, we propose three improved PPO algorithms: Adaptive JS Penalty Coefficient PPO, Leaky PPO, and Parametric PPO. To validate the proposed algorithms, we generated three autonomous driving scenarios in the MetaDrive simulator. Experimental results show that Leaky PPO outperforms the other five PPO variants in various autonomous driving simulation scenarios. Furthermore, we show that Leaky PPO outperforms other popular RL algorithms and achieves state-of-the-art performance.
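The issues the abstract names for the two original PPO versions can be illustrated with a small sketch. It uses only the standard per-sample clipped surrogate, min(r·A, clip(r, 1−ε, 1+ε)·A), and the textbook KL and Jensen-Shannon divergences on Bernoulli distributions. It does not reproduce the Leaky PPO, Parametric PPO, or Adaptive JS Penalty objectives, which this record does not define. The function names, the choice ε = 0.2, the Bernoulli example, and the reading of "asymmetric penalty" as the asymmetry of the KL divergence are all assumptions made for illustration.

```python
import math


def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample (eps=0.2 is an assumed value)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def surrogate_grad_wrt_ratio(ratio, advantage, eps=0.2, h=1e-6):
    """Central finite-difference estimate of d(surrogate)/d(ratio)."""
    up = clipped_surrogate(ratio + h, advantage, eps)
    down = clipped_surrogate(ratio - h, advantage, eps)
    return (up - down) / (2.0 * h)


def kl_bernoulli(p, q):
    """KL(Bernoulli(p) || Bernoulli(q)) for 0 < p, q < 1."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def js_bernoulli(p, q):
    """Jensen-Shannon divergence between Bernoulli(p) and Bernoulli(q)."""
    m = 0.5 * (p + q)
    return 0.5 * kl_bernoulli(p, m) + 0.5 * kl_bernoulli(q, m)


if __name__ == "__main__":
    # Gradient loss: once the ratio leaves [1-eps, 1+eps] in the direction
    # the advantage favours, the surrogate is flat and the gradient is zero.
    print("grad at r=1.0, A=+1:", surrogate_grad_wrt_ratio(1.0, 1.0))
    print("grad at r=1.5, A=+1:", surrogate_grad_wrt_ratio(1.5, 1.0))
    print("grad at r=0.5, A=-1:", surrogate_grad_wrt_ratio(0.5, -1.0))

    # Pessimistic estimate: the clipped surrogate never exceeds r * A.
    print("clipped vs unclipped:", clipped_surrogate(1.5, 1.0), 1.5 * 1.0)

    # KL depends on argument order; JS does not.
    print("KL(0.5||0.9), KL(0.9||0.5):", kl_bernoulli(0.5, 0.9), kl_bernoulli(0.9, 0.5))
    print("JS(0.5,0.9), JS(0.9,0.5):", js_bernoulli(0.5, 0.9), js_bernoulli(0.9, 0.5))
```

In this sketch the gradient vanishes at r = 1.5 with a positive advantage and at r = 0.5 with a negative advantage, which is one way to see the "gradient loss" the abstract refers to. The name "Leaky PPO" suggests a nonzero slope in those regions, but the exact form is not given here.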

Original language: English
Title of host publication: 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350359312
DOIs
State: Published - 1 Jan 2024
Event: 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Yokohama, Japan
Duration: 30 Jun 2024 to 5 Jul 2024

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2024 International Joint Conference on Neural Networks, IJCNN 2024
Country/Territory: Japan
City: Yokohama
Period: 30/06/24 to 5/07/24
