TY - GEN
T1 - Opportunistic spectrum access
T2 - 2008 IEEE Global Telecommunications Conference, GLOBECOM 2008
AU - Alaya-Feki, Afef Ben Hadj
AU - Sayrac, Berna
AU - Moulines, Eric
AU - Lecornec, Alain
PY - 2008/12/1
Y1 - 2008/12/1
N2 - This paper presents an online tuning approach for the ad hoc reinforcement learning algorithms used to solve the exploration-exploitation dilemma of opportunistic spectrum access in dynamic environments. These algorithms originate from a well-known problem in computer science, the Multi-Armed Bandit (MAB) problem, and they have been shown to be viable solutions for the detection and exploration of white spaces in opportunistic spectrum access. Previous work [3] has shown that reinforcement learning solutions of the MAB problem are very sensitive to the statistical properties of the wireless medium access and therefore need careful tuning according to the dynamic variations of the wireless environment. This paper addresses the online tuning of those algorithms by proposing and assessing two approaches: (1) a meta-learning approach, in which a second learner (the meta-learner) learns the parameters of the base learner, and (2) the Exp3 algorithm, which has previously been proposed for dynamic tuning of MAB parameters in other contexts. Simulation results obtained on an IEEE 802.11 medium access scenario show that one of the proposed meta-learning methods, namely the change point detection method, achieves much better performance than the other methods.
UR - https://www.scopus.com/pages/publications/67449087817
U2 - 10.1109/GLOCOM.2008.ECP.594
DO - 10.1109/GLOCOM.2008.ECP.594
M3 - Conference contribution
AN - SCOPUS:67449087817
SN - 9781424423248
T3 - GLOBECOM - IEEE Global Telecommunications Conference
SP - 3096
EP - 3100
BT - 2008 IEEE Global Telecommunications Conference, GLOBECOM 2008
Y2 - 30 November 2008 through 4 December 2008
ER -