Deep Reinforcement Learning for Scheduling Uplink IoT Traffic with Strict Deadlines

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper considers the Multiple Access problem where N Internet of Things (IoT) devices share a common wireless medium towards a central Base Station (BS). We propose a Reinforcement Learning (RL) method where the BS is the agent and the devices are part of the environment. A device is allowed to transmit only when the BS decides to schedule it. Besides the information packets, devices send additional messages such as the delay or the number of packets discarded since their last transmission. This information is used to design the RL reward function and constitutes the next observation that the agent can use to schedule the next device. Leveraging RL allows us to learn the sporadic and heterogeneous traffic patterns of the IoT devices and an optimal scheduling policy that maximizes the channel throughput. We adapt the Proximal Policy Optimization (PPO) algorithm with a Recurrent Neural Network (RNN) to handle the partial observability of our problem and exploit the temporal correlations of the users' traffic. We demonstrate the performance of our model through simulations with different numbers of heterogeneous devices with periodic traffic and individual latency constraints. We show that our RL algorithm outperforms traditional scheduling schemes and distributed medium access algorithms.
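
The agent and environment interaction described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the class names `Device` and `UplinkSchedulingEnv`, the slotted time model, the rule that one packet is sent per scheduled slot, and the reward (delivered packets minus reported discards) are all assumptions made for illustration. The PPO and RNN components are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Device:
    """Hypothetical IoT device with periodic traffic and a per-packet deadline."""
    period: int      # slots between packet arrivals (assumed periodic traffic)
    deadline: int    # slots a packet may wait before it is discarded
    offset: int = 0  # slot of the first arrival
    queue: List[int] = field(default_factory=list)  # ages of waiting packets
    discarded: int = 0  # packets dropped since the last transmission


class UplinkSchedulingEnv:
    """Toy slotted environment: the BS (agent) schedules one device per slot."""

    def __init__(self, devices: List[Device]):
        self.devices = devices
        self.t = 0

    def step(self, action: int) -> Tuple[Tuple[int, int], float]:
        # 1. New packets arrive according to each device's period.
        for d in self.devices:
            if self.t >= d.offset and (self.t - d.offset) % d.period == 0:
                d.queue.append(0)

        # 2. Only the scheduled device transmits (oldest packet first) and
        #    piggybacks its delay and discard count, as the abstract describes.
        dev = self.devices[action]
        if dev.queue:
            delay, delivered = dev.queue.pop(0), 1
        else:
            delay, delivered = -1, 0  # -1 marks an empty (wasted) slot
        reported_discards = dev.discarded
        dev.discarded = 0

        # 3. Age the remaining packets and drop those that reach their deadline.
        for d in self.devices:
            aged = [a + 1 for a in d.queue]
            d.discarded += sum(1 for a in aged if a >= d.deadline)
            d.queue = [a for a in aged if a < d.deadline]

        self.t += 1
        # Assumed reward: +1 per delivered packet, -1 per reported discard.
        reward = float(delivered - reported_discards)
        return (delay, reported_discards), reward


env = UplinkSchedulingEnv([Device(period=2, deadline=1),
                           Device(period=2, deadline=1)])
first = env.step(0)   # device 0 delivers its packet; device 1's packet expires
second = env.step(1)  # device 1 has nothing to send and reports one discard
```

The agent only observes what the scheduled device reports, so the queues of the other devices stay hidden. This is the partial observability that motivates a recurrent policy in the abstract.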

Original language: English
Journal: Proceedings - IEEE Global Communications Conference, GLOBECOM
Publication status: Published - 1 Jan 2021
Event: 2021 IEEE Global Communications Conference, GLOBECOM 2021 - Madrid, Spain
Duration: 7 Dec 2021 → 11 Dec 2021

Keywords

  • Internet of Things
  • Multiple Access
  • POMDP
  • Proximal Policy Optimization
  • Reinforcement Learning
  • Wireless sensor networks
  • scheduling
