Adaptive Window Strategy for Topic Modeling in Document Streams

Pierre Alexandre Murena, Marie Al-Ghossein, Talel Abdessalem, Antoine Cornuejols

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Extracting global themes from written text has recently become a major issue for computational intelligence, in particular in the Natural Language Processing community. Among the proposed solutions, Latent Dirichlet Allocation (LDA) has attracted considerable interest, and several variants have been proposed to adapt to changing environments. With the emergence of data streams, for instance from social media, the domain faces a new challenge: topic extraction in real time. In this paper, we propose a simple approach called Adaptive Window based Incremental LDA (AWILDA), which combines LDA with state-of-the-art methods in data stream mining. We train new topic models only when a drift is detected and select training data on the fly using the ADWIN algorithm. We provide both theoretical guarantees for our method and experimental validation on artificial and real-world data.
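
To illustrate the idea of retraining only when a drift is detected, the following is a minimal Python sketch of an ADWIN-style adaptive window. It is not the authors' implementation: it is a simplified, non-bucketed variant that applies the Hoeffding-based cut threshold from the original ADWIN description to every split of the window. The class name `SimpleAdaptiveWindow`, the helper `retrain_points`, the parameters `min_sub` and `max_size`, and the assumption that each document is summarized by a score in [0, 1] (for example, some normalized measure of fit under the current topic model) are all hypothetical and introduced only for this example.

```python
import math
from collections import deque


class SimpleAdaptiveWindow:
    """Simplified ADWIN-style window over values in [0, 1] (illustrative only)."""

    def __init__(self, delta=0.002, min_sub=5, max_size=1000):
        self.delta = delta        # confidence parameter of the cut test
        self.min_sub = min_sub    # minimum size of each sub-window (assumption)
        self.window = deque(maxlen=max_size)  # size cap is an assumption

    def update(self, value):
        """Append a value; return True if older data was dropped (drift)."""
        self.window.append(value)
        drift = False
        while self._cut_once():
            drift = True
        return drift

    def _cut_once(self):
        """Drop the oldest part of the window if some split shows a change."""
        values = list(self.window)
        n = len(values)
        total = sum(values)
        head = 0.0
        # ln(4 / delta') with delta' = delta / n
        log_term = math.log(4.0 * n / self.delta)
        for i in range(1, n):
            head += values[i - 1]
            n0, n1 = i, n - i
            if n0 < self.min_sub or n1 < self.min_sub:
                continue
            m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-mean term of the two sizes
            eps_cut = math.sqrt(log_term / (2.0 * m))
            if abs(head / n0 - (total - head) / n1) > eps_cut:
                for _ in range(n0):  # forget the older sub-window
                    self.window.popleft()
                return True
        return False


def retrain_points(scores, delta=0.002):
    """Return stream positions at which a drift would trigger retraining."""
    detector = SimpleAdaptiveWindow(delta=delta)
    return [t for t, score in enumerate(scores) if detector.update(score)]
```

In a setup of this kind, each position returned by `retrain_points` would mark a moment to train a new topic model, and the contents of `detector.window` at that moment would be the candidate training data. How the per-document score is computed, and how the retained window maps back to documents, is left open here.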

Original language: English
Title of host publication: 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509060146
Publication status: Published - 10 Oct 2018
Externally published: Yes
Event: 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Rio de Janeiro, Brazil
Duration: 8 Jul 2018 to 13 Jul 2018

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2018-July

Conference

Conference: 2018 International Joint Conference on Neural Networks, IJCNN 2018
Country/Territory: Brazil
City: Rio de Janeiro
Period: 8/07/18 to 13/07/18
