TY - GEN
T1 - Cluster-based data oriented hashing
AU - Chafik, Sanaa
AU - Daoudi, Imane
AU - El Yacoubi, Mounim A.
AU - El Ouardi, Hamid
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/12/2
Y1 - 2015/12/2
N2 - Many multidimensional hashing schemes have been actively studied in recent years, providing efficient nearest neighbor search. Generally, we can distinguish several hashing families, such as learning-based hashing, which provides better hash function selectivity by learning the dataset distribution. The spatial hashing family proposes a suitable partition of the multidimensional space, better adapted to the distribution of the data points. Despite the efficiency of multidimensional hashing techniques in solving the nearest neighbor search problem, these techniques suffer from scalability issues. In this paper, we propose a novel hashing algorithm, named Cluster Based Data Oriented Hashing, that combines space hashing and learning-based hashing techniques. The proposed approach first applies a clustering algorithm to structure the multidimensional space into clusters. Then, in each cluster, a learning-based hashing algorithm is applied by selecting an appropriate hash function that fits the data distribution. Experimental comparisons with standard Euclidean Locality Sensitive Hashing demonstrate the effectiveness of the proposed method for large datasets.
AB - Many multidimensional hashing schemes have been actively studied in recent years, providing efficient nearest neighbor search. Generally, we can distinguish several hashing families, such as learning-based hashing, which provides better hash function selectivity by learning the dataset distribution. The spatial hashing family proposes a suitable partition of the multidimensional space, better adapted to the distribution of the data points. Despite the efficiency of multidimensional hashing techniques in solving the nearest neighbor search problem, these techniques suffer from scalability issues. In this paper, we propose a novel hashing algorithm, named Cluster Based Data Oriented Hashing, that combines space hashing and learning-based hashing techniques. The proposed approach first applies a clustering algorithm to structure the multidimensional space into clusters. Then, in each cluster, a learning-based hashing algorithm is applied by selecting an appropriate hash function that fits the data distribution. Experimental comparisons with standard Euclidean Locality Sensitive Hashing demonstrate the effectiveness of the proposed method for large datasets.
U2 - 10.1109/DSAA.2015.7344895
DO - 10.1109/DSAA.2015.7344895
M3 - Conference contribution
AN - SCOPUS:84962882029
T3 - Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015
BT - Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015
A2 - Gaussier, Eric
A2 - Cao, Longbing
A2 - Gallinari, Patrick
A2 - Kwok, James
A2 - Pasi, Gabriella
A2 - Zaiane, Osmar
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015
Y2 - 19 October 2015 through 21 October 2015
ER -