TY - GEN
T1 - An enhanced reinforcement learning approach for dynamic placement of virtual network functions
AU - Houidi, Omar
AU - Soualah, Oussama
AU - Louati, Wajdi
AU - Zeghlache, Djamal
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/8/1
Y1 - 2020/8/1
AB - This paper addresses Virtualized Network Function Forwarding Graph (VNF-FG) embedding with the objective of realizing long-term reward, in contrast to placement algorithms that aim at instantaneous optimal placement. The long-term reward is obtained using Reinforcement Learning (RL), following a Markov Decision Process (MDP) model, enhanced through the injection of expert knowledge into the learning process. A comparison with an Integer Linear Programming (ILP) approach, a reduced-candidate-set variant (R-ILP), and an algorithm that treats requests in batches reveals the potential improvements of the RL approach. Instantaneous and short-term reward solutions base their decisions only on the current infrastructure status, for a single request at a time or possibly a batch of requests; they are therefore efficient only under present conditions and do not anticipate future requests. RL, by contrast, possesses the learning and anticipation capabilities that instantaneous and snapshot optimizations lack. An RL-based approach called EQL (Enhanced Q-Learning), which aims to balance the load on hosting infrastructures, is proposed to achieve the desired longer-term reward. EQL employs RL to learn the network and control it based on the usage patterns of the physical resources. Results from extensive simulations, based on realistic and large-scale topologies, report the superior performance of EQL in terms of acceptance rate, quality, scalability, and achieved gains.
KW - Dynamic Service Placement
KW - Network Function Virtualization
KW - Optimization
KW - Reinforcement Learning
U2 - 10.1109/PIMRC48278.2020.9217250
DO - 10.1109/PIMRC48278.2020.9217250
M3 - Conference contribution
AN - SCOPUS:85094141033
T3 - IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC
BT - 2020 IEEE 31st Annual International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 31st IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2020
Y2 - 31 August 2020 through 3 September 2020
ER -