
Leaky PPO: A Simple and Efficient RL Algorithm for Autonomous Vehicles

  • CNRS UMR 5157 SAMOVAR

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Interest in applying Reinforcement Learning (RL) to Autonomous Vehicles (AVs) is growing rapidly. Proximal Policy Optimization (PPO), a well-known RL algorithm with two versions, is simple to implement and has a high level of generality. In this paper, we first analyze the issues in each of the original PPO versions: an asymmetric penalty in the Adaptive KL Penalty Coefficient PPO version, and gradient loss and a pessimistic estimate in the Clipped PPO version. To address these issues, we propose three improved PPO algorithms: Adaptive JS Penalty Coefficient PPO, Leaky PPO, and Parametric PPO. To validate the effectiveness of the proposed algorithms, we generated three autonomous driving scenarios in the MetaDrive simulator. Experimental results demonstrate that Leaky PPO outperforms the other five PPO variants in various autonomous driving simulation scenarios. Furthermore, we demonstrate that Leaky PPO outperforms other popular RL algorithms and achieves state-of-the-art performance.
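
The "gradient loss" issue named in the abstract can be illustrated with the standard clipped PPO surrogate, min(r * A, clip(r, 1 - eps, 1 + eps) * A), where r is the probability ratio and A the advantage. The sketch below is a minimal, dependency-free illustration and not the authors' code. The abstract does not give the Leaky PPO formula, so `leaky_surrogate` and its slope parameter `alpha` are hypothetical, chosen by analogy with a leaky ReLU only to show how a small nonzero slope outside the clip range would keep a gradient signal.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))


def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard clipped PPO surrogate for a single sample."""
    return min(ratio * advantage, clip(ratio, 1 - eps, 1 + eps) * advantage)


def leaky_surrogate(ratio, advantage, eps=0.2, alpha=0.01):
    """HYPOTHETICAL leaky variant (an assumption, not the paper's definition).

    Outside [1 - eps, 1 + eps] the clipped ratio keeps a small slope alpha
    instead of being flat.
    """
    clipped = clip(ratio, 1 - eps, 1 + eps)
    leaky = clipped + alpha * (ratio - clipped)
    return min(ratio * advantage, leaky * advantage)


def grad_wrt_ratio(fn, ratio, advantage, h=1e-6):
    """Central finite-difference derivative of fn with respect to the ratio."""
    return (fn(ratio + h, advantage) - fn(ratio - h, advantage)) / (2 * h)


if __name__ == "__main__":
    # Positive advantage, ratio already above 1 + eps: the clipped term is flat.
    print("clipped gradient:", grad_wrt_ratio(clipped_surrogate, 1.5, 1.0))
    print("leaky gradient:  ", grad_wrt_ratio(leaky_surrogate, 1.5, 1.0))
```

With a positive advantage and a ratio above 1 + eps, the clipped surrogate is constant in the ratio, so that sample contributes no gradient. Because the surrogate takes the minimum of the two terms, it never exceeds the unclipped objective r * A, which is the sense in which the estimate is pessimistic.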

Original language: English
Title of host publication: 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350359312
Publication status: Published - 1 Jan 2024
Event: 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Yokohama, Japan
Duration: 30 Jun 2024 to 5 Jul 2024

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2024 International Joint Conference on Neural Networks, IJCNN 2024
Country/Territory: Japan
City: Yokohama
Period: 30/06/24 to 5/07/24

Keywords

  • Autonomous Vehicles
  • Proximal Policy Optimization
  • Reinforcement Learning
