A simple mixture model for unsupervised text categorisation

F. Clérot, F. Fessant, O. Collin, O. Cappé, E. Moulines

Research output: Contribution to journal › Conference article › peer-review

Abstract

Automatically segmenting text corpora into thematically related groups is a complex exploratory analysis problem. In this article, we outline our multi-stage exploratory analysis process and investigate the performance of a simple statistical model. After a description of this model and of its fitting procedure, we illustrate its performance on the segmentation of a corpus of CKM-related texts in English.

Original language: English
Pages (from-to): 13-22
Number of pages: 10
Journal: Management Information Systems
Volume: 10
Publication status: Published - 1 Dec 2004
Event: Fifth International Conference on Data Mining, DATA MINING V - Malaga, Spain
Duration: 15 Sept 2004 - 17 Sept 2004

Keywords

  • Clustering
  • Exploratory analysis
  • Mixture model
  • Text mining
