
Quality scheme assessment in the clustering process

  • Athens Univ. of Econ. and Business

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Clustering is mostly an unsupervised procedure, and most clustering algorithms depend on assumptions and initial guesses in order to define the subgroups present in a data set. As a consequence, in most applications the final clusters require some sort of evaluation. The evaluation procedure has to tackle difficult problems, which can be qualitatively expressed as: (i) the quality of the clusters, (ii) the degree to which a clustering scheme fits a specific data set, and (iii) the optimal number of clusters in a partitioning. In this paper we present a scheme for finding the optimal partitioning of a data set during the clustering process, regardless of the clustering algorithm used. More specifically, we present an approach for evaluating clustering schemes (partitions) so as to find the best number of clusters for a specific data set. A clustering algorithm produces different partitions for different values of its input parameters. The proposed approach selects the best clustering scheme (i.e., the scheme with the most compact and well-separated clusters) according to a quality index we define. We verified our approach using two popular clustering algorithms on synthetic and real data sets in order to evaluate its reliability. Moreover, we study the influence of different clustering parameters on the proposed quality index.
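The selection loop described in the abstract (run a clustering algorithm for several parameter values, score each partition, keep the best) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not give the quality index, so `quality_index` below is a hypothetical stand-in (mean distance to the cluster center divided by the smallest distance between centers, lower is better). The k-means routine, its farthest-point initialization, and all function names are likewise illustrative choices.

```python
import math

def farthest_point_init(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means on tuples of floats; returns (centers, labels)."""
    centers = farthest_point_init(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old center if a cluster happens to become empty.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def quality_index(points, centers, labels):
    """Hypothetical stand-in index (NOT the paper's): compactness divided by
    separation, so lower values mean more compact, better separated clusters."""
    compactness = sum(math.dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)
    separation = min(math.dist(a, b)
                     for i, a in enumerate(centers) for b in centers[i + 1:])
    return math.inf if separation == 0 else compactness / separation

def select_best(points, k_values):
    """Cluster once per candidate k and return the k with the lowest index."""
    scores = {}
    for k in k_values:
        centers, labels = kmeans(points, k)
        scores[k] = quality_index(points, centers, labels)
    return min(scores, key=scores.get), scores

# Synthetic data: three small 3x3 grids of points around well-separated centers.
offsets = (-0.5, 0.0, 0.5)
points = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
          for dx in offsets for dy in offsets]

best_k, scores = select_best(points, range(2, 6))
print(best_k, scores)
```

Because `select_best` only needs a function that returns a partition for a given parameter value, another clustering algorithm could be substituted for `kmeans` without changing the selection logic, which mirrors the abstract's claim of independence from the algorithm used.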

Original language: English
Title of host publication: Principles of Data Mining and Knowledge Discovery - 4th European Conference, PKDD 2000, Proceedings
Editors: Djamel A. Zighed, Jan Komorowski, Jan Zytkow
Publisher: Springer Verlag
Pages: 265-276
Number of pages: 12
ISBN (Print): 9783540410669
Publication status: Published - 1 Jan 2000
Externally published: Yes
Event: 4th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2000 - Lyon, France
Duration: 13 Sept 2000 to 16 Sept 2000

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1910
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2000
Country/Territory: France
City: Lyon
Period: 13/09/00 to 16/09/00
