TY - GEN
T1 - Neural network fusion of color, depth and location for object instance recognition on a mobile robot
AU - Caron, Louis Charles
AU - Filliat, David
AU - Gepperth, Alexander
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015/1/1
Y1 - 2015/1/1
N2 - The development of mobile robots for domestic assistance requires solving problems that integrate ideas from different fields of research, such as computer vision, robotic manipulation, and localization and mapping. Semantic mapping, that is, the enrichment of a map with high-level information such as room and object identities, is an example of such a complex robotic task. Solving this task requires taking into account the hard software and hardware constraints imposed by the context of autonomous mobile robots, where short processing times and low energy consumption are mandatory. We present a lightweight scene segmentation and object instance recognition algorithm using an RGB-D camera and demonstrate it in a semantic mapping experiment. Our method uses a feed-forward neural network to fuse texture, color, and depth information. Running at 3 Hz on a single laptop computer, our algorithm achieves a recognition rate of 97% in a controlled environment and 87% in the adversarial conditions of a real robotic task. Our results demonstrate that state-of-the-art recognition rates on a database do not guarantee performance in a real-world experiment. We also show the benefit, in these conditions, of fusing several recognition decisions and data from different sources. The database we compiled for the purpose of this study is publicly available.
AB - The development of mobile robots for domestic assistance requires solving problems that integrate ideas from different fields of research, such as computer vision, robotic manipulation, and localization and mapping. Semantic mapping, that is, the enrichment of a map with high-level information such as room and object identities, is an example of such a complex robotic task. Solving this task requires taking into account the hard software and hardware constraints imposed by the context of autonomous mobile robots, where short processing times and low energy consumption are mandatory. We present a lightweight scene segmentation and object instance recognition algorithm using an RGB-D camera and demonstrate it in a semantic mapping experiment. Our method uses a feed-forward neural network to fuse texture, color, and depth information. Running at 3 Hz on a single laptop computer, our algorithm achieves a recognition rate of 97% in a controlled environment and 87% in the adversarial conditions of a real robotic task. Our results demonstrate that state-of-the-art recognition rates on a database do not guarantee performance in a real-world experiment. We also show the benefit, in these conditions, of fusing several recognition decisions and data from different sources. The database we compiled for the purpose of this study is publicly available.
KW - Indoor scene understanding
KW - Instance recognition
KW - Mobile robotics
KW - RGB-D camera
KW - Semantic mapping
U2 - 10.1007/978-3-319-16199-0_55
DO - 10.1007/978-3-319-16199-0_55
M3 - Conference contribution
AN - SCOPUS:84958536473
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 791
EP - 805
BT - Computer Vision - ECCV 2014 Workshops, Proceedings
A2 - Rother, Carsten
A2 - Agapito, Lourdes
A2 - Bronstein, Michael M.
PB - Springer Verlag
T2 - 13th European Conference on Computer Vision, ECCV 2014
Y2 - 6 September 2014 through 12 September 2014
ER -