TY - GEN
T1 - Learning a set of interrelated tasks by using sequences of motor policies for a strategic intrinsically motivated learner
AU - Duminy, Nicolas
AU - Nguyen, Sao Mai
AU - Duhaut, Dominique
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/4/2
Y1 - 2018/4/2
N2 - We propose an active learning architecture for robots, capable of organizing its learning process to achieve a field of complex tasks by learning sequences of motor policies, called Intrinsically Motivated Procedure Babbling (IM-PB). The learner can generalize over its experience to continuously learn new tasks. It actively chooses what and how to learn based on empirical measures of its own progress. In this paper, we consider the learning of a set of hierarchically organized, interrelated task outcomes. We introduce a framework called "procedures", which are sequences of policies defined by the combination of previously learned skills. Our algorithmic architecture uses the procedures to autonomously discover how to combine simple skills to achieve complex goals. It actively chooses between two strategies of goal-directed exploration: exploration of the policy space or of the procedural space. We show in a simulated environment that our new architecture is capable of tackling the learning of complex motor policies and of adapting the complexity of its policies to the task at hand. We also show that our "procedures" framework helps the learner to tackle difficult hierarchical tasks.
AB - We propose an active learning architecture for robots, capable of organizing its learning process to achieve a field of complex tasks by learning sequences of motor policies, called Intrinsically Motivated Procedure Babbling (IM-PB). The learner can generalize over its experience to continuously learn new tasks. It actively chooses what and how to learn based on empirical measures of its own progress. In this paper, we consider the learning of a set of hierarchically organized, interrelated task outcomes. We introduce a framework called "procedures", which are sequences of policies defined by the combination of previously learned skills. Our algorithmic architecture uses the procedures to autonomously discover how to combine simple skills to achieve complex goals. It actively chooses between two strategies of goal-directed exploration: exploration of the policy space or of the procedural space. We show in a simulated environment that our new architecture is capable of tackling the learning of complex motor policies and of adapting the complexity of its policies to the task at hand. We also show that our "procedures" framework helps the learner to tackle difficult hierarchical tasks.
KW - Developmental robotics
KW - Intrinsic motivation
KW - Learning complex policies
KW - Strategical learning
UR - https://www.scopus.com/pages/publications/85049611372
U2 - 10.1109/IRC.2018.00061
DO - 10.1109/IRC.2018.00061
M3 - Conference contribution
AN - SCOPUS:85049611372
T3 - Proceedings - 2nd IEEE International Conference on Robotic Computing, IRC 2018
SP - 288
EP - 291
BT - Proceedings - 2nd IEEE International Conference on Robotic Computing, IRC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE International Conference on Robotic Computing, IRC 2018
Y2 - 31 January 2018 through 2 February 2018
ER -