TY - GEN
T1 - Temporal encoded F-formation system for social interaction detection
AU - Gan, Tian
AU - Wong, Yongkang
AU - Zhang, Daqing
AU - Kankanhalli, Mohan S.
PY - 2013/11/18
Y1 - 2013/11/18
N2 - In the context of a social gathering, such as a cocktail party, the memorable moments are generally captured by professional photographers or by the participants. The latter case is often undesirable because many participants would rather enjoy the event instead of being occupied by the photo-taking task. Motivated by this scenario, we propose the use of a set of cameras to automatically take photos. Instead of performing dense analysis on all cameras for photo capturing, we first detect the occurrence and location of social interactions via F-formation detection. In the sociology literature, F-formation is a concept used to define social interactions, where each detection only requires the spatial location and orientation of each participant. This information can be robustly obtained with additional Kinect depth sensors. In this paper, we propose an extended F-formation system for robust detection of interactions and interactants. The extended F-formation system employs a heat-map based feature representation for each individual, namely Interaction Space (IS), to model their location, orientation, and temporal information. Using the temporally encoded IS for each detected interactant, we propose a best-view camera selection framework to detect the corresponding best view camera for each detected social interaction. The extended F-formation system is evaluated with synthetic data on multiple scenarios. To demonstrate the effectiveness of the proposed system, we conducted a user study to compare our best view camera ranking with human's ranking using real-world data.
AB - In the context of a social gathering, such as a cocktail party, the memorable moments are generally captured by professional photographers or by the participants. The latter case is often undesirable because many participants would rather enjoy the event instead of being occupied by the photo-taking task. Motivated by this scenario, we propose the use of a set of cameras to automatically take photos. Instead of performing dense analysis on all cameras for photo capturing, we first detect the occurrence and location of social interactions via F-formation detection. In the sociology literature, F-formation is a concept used to define social interactions, where each detection only requires the spatial location and orientation of each participant. This information can be robustly obtained with additional Kinect depth sensors. In this paper, we propose an extended F-formation system for robust detection of interactions and interactants. The extended F-formation system employs a heat-map based feature representation for each individual, namely Interaction Space (IS), to model their location, orientation, and temporal information. Using the temporally encoded IS for each detected interactant, we propose a best-view camera selection framework to detect the corresponding best view camera for each detected social interaction. The extended F-formation system is evaluated with synthetic data on multiple scenarios. To demonstrate the effectiveness of the proposed system, we conducted a user study to compare our best view camera ranking with human's ranking using real-world data.
KW - Behaviour modeling
KW - F-formation
KW - Social computing
KW - Social interaction
KW - Video analytics
U2 - 10.1145/2502081.2502096
DO - 10.1145/2502081.2502096
M3 - Conference contribution
AN - SCOPUS:84887483032
SN - 9781450324045
T3 - MM 2013 - Proceedings of the 2013 ACM Multimedia Conference
SP - 937
EP - 946
BT - MM 2013 - Proceedings of the 2013 ACM Multimedia Conference
T2 - 21st ACM International Conference on Multimedia, MM 2013
Y2 - 21 October 2013 through 25 October 2013
ER -