TY - GEN
T1 - Distributed knowledge discovery with non linear dimensionality reduction
AU - Magdalinos, Panagis
AU - Vazirgiannis, Michalis
AU - Valsamou, Dialecti
PY - 2010/12/1
Y1 - 2010/12/1
N2 - The results of data mining tasks are usually improved by reducing the dimensionality of data. This improvement, however, is harder to achieve when data lie on a non linear manifold and are distributed across network nodes. Although numerous algorithms for distributed dimensionality reduction have been proposed, all assume that data reside in a linear space. In order to address the non linear case, we introduce D-Isomap, a novel distributed non linear dimensionality reduction algorithm, particularly applicable in large scale, structured peer-to-peer networks. Apart from unfolding a non linear manifold, our algorithm is capable of approximate reconstruction of the global dataset at peer level, a very attractive feature for distributed data mining problems. We extensively evaluate its performance through experiments on both artificial and real world datasets. The obtained results show the suitability and viability of our approach for knowledge discovery in distributed environments.
AB - The results of data mining tasks are usually improved by reducing the dimensionality of data. This improvement, however, is harder to achieve when data lie on a non linear manifold and are distributed across network nodes. Although numerous algorithms for distributed dimensionality reduction have been proposed, all assume that data reside in a linear space. In order to address the non linear case, we introduce D-Isomap, a novel distributed non linear dimensionality reduction algorithm, particularly applicable in large scale, structured peer-to-peer networks. Apart from unfolding a non linear manifold, our algorithm is capable of approximate reconstruction of the global dataset at peer level, a very attractive feature for distributed data mining problems. We extensively evaluate its performance through experiments on both artificial and real world datasets. The obtained results show the suitability and viability of our approach for knowledge discovery in distributed environments.
KW - Distributed data mining
KW - Distributed non linear dimensionality reduction
U2 - 10.1007/978-3-642-13672-6_2
DO - 10.1007/978-3-642-13672-6_2
M3 - Conference contribution
AN - SCOPUS:79956324642
SN - 3642136710
SN - 9783642136719
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 14
EP - 26
BT - Advances in Knowledge Discovery and Data Mining - 14th Pacific-Asia Conference, PAKDD 2010, Proceedings
T2 - 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010
Y2 - 21 June 2010 through 24 June 2010
ER -