TY - GEN
T1 - On Multisensor Activation Policies for Bernoulli Tracking
AU - Saucan, Augustin A.
AU - Das, Subhro
AU - Win, Moe Z.
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - In this work, we propose a family of sensor-activation policies coupled with a learning method for information-seeking sensor activation in multisensor Bernoulli tracking applications. Sensor activation, and sensor management more generally, are of great interest in multi-agent networks and Internet of Things (IoT) applications, where limited energy, sensing, and communication resources must be allocated efficiently for optimal inference. Non-myopic control, resource constraints, and the partial observability of the Bernoulli target are the main challenges addressed in this work. The novelty of our approach is threefold. First, a belief-space Markov decision process (MDP) reformulation is proposed for the Bernoulli tracking and control problem that incorporates uncertainties in both object existence and object state. Second, a parametric family of sensor-activation distributions is proposed as control policies that harness the mutual information between the sensor measurements and the belief state. Third, a novel reward metric is employed to capture the information gain on the Bernoulli belief state from a specific combination of sensor activations. A Bayesian actor-critic (BAC) reinforcement learning (RL) methodology is employed to further refine the policy by maximizing a discounted reward over an infinite horizon under an imposed activation constraint. Numerical simulations validate our approach and show improved tracking performance over a uniform sensor-activation method.
U2 - 10.1109/MILCOM52596.2021.9652984
DO - 10.1109/MILCOM52596.2021.9652984
M3 - Conference contribution
AN - SCOPUS:85124151088
T3 - Proceedings - IEEE Military Communications Conference MILCOM
SP - 795
EP - 801
BT - MILCOM 2021 - 2021 IEEE Military Communications Conference
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Military Communications Conference, MILCOM 2021
Y2 - 29 November 2021 through 2 December 2021
ER -