TY - GEN
T1 - Hunting or waiting? Discovering passenger-finding strategies from a large-scale real-world taxi dataset
AU - Li, Bin
AU - Zhang, Daqing
AU - Sun, Lin
AU - Chen, Chao
AU - Li, Shijian
AU - Qi, Guande
AU - Yang, Qiang
PY - 2011/1/1
Y1 - 2011/1/1
N2 - In modern cities, more and more vehicles, such as taxis, have been equipped with GPS devices for localization and navigation. Gathering and analyzing these large-scale real-world digital traces has provided us with an unprecedented opportunity to understand city dynamics and reveal hidden social and economic realities. One innovative pervasive application is to provide correct driving strategies to taxi drivers according to time and location. In this paper, we aim to discover both efficient and inefficient passenger-finding strategies from a large-scale taxi GPS dataset, which was collected from 5350 taxis over one year in a large city in China. By representing the passenger-finding strategies as a Time-Location-Strategy feature triplet and constructing a train/test dataset containing both top- and ordinary-performance taxi features, we adopt a powerful feature selection tool, L1-Norm SVM, to select the most salient feature patterns determining taxi performance. We find that the selected patterns explain the empirical study results derived from raw data analysis well and even reveal interesting hidden facts. Moreover, the taxi performance predictor built on the selected features achieves a prediction accuracy of 85.3% on a new test dataset and outperforms the one based on all the features, which implies that the selected features are indeed the right indicators of the passenger-finding strategies.
KW - GPS
KW - Large-scale Data
KW - Passenger-Finding Strategy
KW - Reality Mining
KW - Taxi Data Mining
UR - https://www.scopus.com/pages/publications/79958070495
U2 - 10.1109/PERCOMW.2011.5766967
DO - 10.1109/PERCOMW.2011.5766967
M3 - Conference contribution
AN - SCOPUS:79958070495
SN - 9781612849379
T3 - 2011 IEEE International Conference on Pervasive Computing and Communications Workshops, PERCOM Workshops 2011
SP - 63
EP - 68
BT - 2011 IEEE International Conference on Pervasive Computing and Communications Workshops, PERCOM Workshops 2011
PB - IEEE Computer Society
T2 - 9th IEEE International Conference on Pervasive Computing and Communications Workshops, PERCOM Workshops 2011
Y2 - 21 March 2011 through 25 March 2011
ER -