TY - GEN
T1 - Fully end-to-end composite recurrent convolution network for deformable facial tracking in the wild
AU - Aspandi, Decky
AU - Martinez, Oriol
AU - Sukno, Federico
AU - Binefa, Xavier
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5/1
Y1 - 2019/5/1
N2 - Human facial tracking is an important task in computer vision, which has recently lost pace compared to other facial analysis tasks. The majority of current available tracker possess two major limitations: their little use of temporal information and the widespread use of handcrafted features, without taking full advantage of the large annotated datasets that have recently become available. In this paper we present a fully end-to-end facial tracking model based on current state of the art deep model architectures that can be effectively trained from the available annotated facial landmark datasets. We build our model from the recently introduced general object tracker Re3, which allows modeling the short and long temporal dependency between frames by means of its internal Long Short Term Memory (LSTM) layers. Facial tracking experiments on the challenging 300-VW dataset show that our model can produce state of the art accuracy and far lower failure rates than competing approaches. We specifically compare the performance of our approach modified to work in tracking-by-detection mode and showed that, as such, it can produce results that are comparable to state of the art trackers. However, upon activation of our tracking mechanism, the results improve significantly, confirming the advantage of taking into account temporal dependencies.
AB - Human facial tracking is an important task in computer vision, which has recently lost pace compared to other facial analysis tasks. The majority of current available tracker possess two major limitations: their little use of temporal information and the widespread use of handcrafted features, without taking full advantage of the large annotated datasets that have recently become available. In this paper we present a fully end-to-end facial tracking model based on current state of the art deep model architectures that can be effectively trained from the available annotated facial landmark datasets. We build our model from the recently introduced general object tracker Re3, which allows modeling the short and long temporal dependency between frames by means of its internal Long Short Term Memory (LSTM) layers. Facial tracking experiments on the challenging 300-VW dataset show that our model can produce state of the art accuracy and far lower failure rates than competing approaches. We specifically compare the performance of our approach modified to work in tracking-by-detection mode and showed that, as such, it can produce results that are comparable to state of the art trackers. However, upon activation of our tracking mechanism, the results improve significantly, confirming the advantage of taking into account temporal dependencies.
U2 - 10.1109/FG.2019.8756630
DO - 10.1109/FG.2019.8756630
M3 - Conference contribution
AN - SCOPUS:85070445709
T3 - Proceedings - 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019
BT - Proceedings - 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019
Y2 - 14 May 2019 through 18 May 2019
ER -