TY - GEN
T1 - Effects of Social Guidance on a Robot Learning Sequences of Policies in Hierarchical Learning
AU - Duminy, Nicolas
AU - Nguyen, Sao Mai
AU - Duhaut, Dominique
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - We aim for a robot capable of learning sequences of motor policies to achieve a field of complex tasks. In this paper, we consider a set of interrelated, hierarchically organized complex tasks. To address the mapping between a continuous high-dimensional space of tasks and an infinite-dimensional space of sequences of policies, we introduce a framework called 'procedure', which enables the creation of sequences of policies by combining previously learned skills. We propose an active learning algorithmic architecture capable of organizing its learning process in order to achieve a field of complex tasks by learning sequences of primitive motor policies. Based on heuristics of goal-babbling, social guidance, strategic learning guided by intrinsic motivation, and the 'procedure' framework, our algorithm can actively decide which outcome to focus on and which exploration strategy to apply. We show that a simulated industrial robot can tackle the learning of complex motor policies and adapt this complexity to that of the task at hand. Owing to its exploration strategies, it can discover the levels of difficulty of the tasks and learn the hierarchy between tasks so as to combine simple tasks to complete a complex task.
AB - We aim for a robot capable of learning sequences of motor policies to achieve a field of complex tasks. In this paper, we consider a set of interrelated, hierarchically organized complex tasks. To address the mapping between a continuous high-dimensional space of tasks and an infinite-dimensional space of sequences of policies, we introduce a framework called 'procedure', which enables the creation of sequences of policies by combining previously learned skills. We propose an active learning algorithmic architecture capable of organizing its learning process in order to achieve a field of complex tasks by learning sequences of primitive motor policies. Based on heuristics of goal-babbling, social guidance, strategic learning guided by intrinsic motivation, and the 'procedure' framework, our algorithm can actively decide which outcome to focus on and which exploration strategy to apply. We show that a simulated industrial robot can tackle the learning of complex motor policies and adapt this complexity to that of the task at hand. Owing to its exploration strategies, it can discover the levels of difficulty of the tasks and learn the hierarchy between tasks so as to combine simple tasks to complete a complex task.
KW - active imitation learning
KW - continual learning
KW - curriculum learning
KW - hierarchical learning
KW - intrinsic motivation
KW - social guidance
U2 - 10.1109/SMC.2018.00636
DO - 10.1109/SMC.2018.00636
M3 - Conference contribution
AN - SCOPUS:85062226083
T3 - Proceedings - 2018 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2018
SP - 3755
EP - 3760
BT - Proceedings - 2018 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2018
Y2 - 7 October 2018 through 10 October 2018
ER -