Social content matching in MapReduce

Gianmarco De Francisci Morales, Aristides Gionis, Mauro Sozio

Research output: Contribution to journal › Article › peer-review

Abstract

Matching problems are ubiquitous. They occur in economic markets, labor markets, internet advertising, and elsewhere. In this paper we focus on an application of matching for social media. Our goal is to distribute content from information suppliers to information consumers. We seek to maximize the overall relevance of the matched content from suppliers to consumers while regulating the overall activity, e.g., ensuring that no consumer is overwhelmed with data and that all suppliers have chances to deliver their content. We propose two matching algorithms, GreedyMR and StackMR, geared for the MapReduce paradigm. Both algorithms have provable approximation guarantees, and in practice they produce high-quality solutions. While both algorithms scale extremely well, we can show that StackMR requires only a poly-logarithmic number of MapReduce steps, making it an attractive option for applications with very large datasets. We experimentally show the trade-offs between quality and efficiency of our solutions on two large datasets coming from real-world social-media web sites.
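To make the problem statement concrete, the following is a minimal sequential sketch of capacity-constrained greedy matching. It is not the paper's GreedyMR or StackMR and it does not use MapReduce; it only illustrates the objective the abstract describes (maximize total relevance while capping how many items each node takes part in). The function name `greedy_b_matching`, the edge tuples, and the capacity values are all hypothetical choices made for illustration.

```python
def greedy_b_matching(edges, capacity):
    """Pick edges by decreasing relevance while respecting node capacities.

    edges:    iterable of (item, consumer, relevance) tuples (assumed format).
    capacity: dict mapping each node to the maximum number of matched edges
              it may take part in (assumed format).
    """
    remaining = dict(capacity)
    matched = []
    # Most relevant pairs first; ties broken by node names for determinism.
    for item, consumer, relevance in sorted(
        edges, key=lambda e: (-e[2], e[0], e[1])
    ):
        # Keep the edge only if both endpoints still have spare capacity.
        if remaining.get(item, 0) > 0 and remaining.get(consumer, 0) > 0:
            matched.append((item, consumer, relevance))
            remaining[item] -= 1
            remaining[consumer] -= 1
    return matched


# Toy instance: two content items, two consumers, capacity 1 everywhere.
edges = [
    ("post_a", "alice", 0.9),
    ("post_a", "bob", 0.8),
    ("post_b", "alice", 0.7),
    ("post_b", "bob", 0.4),
]
capacity = {"post_a": 1, "post_b": 1, "alice": 1, "bob": 1}

matching = greedy_b_matching(edges, capacity)
total = sum(relevance for _, _, relevance in matching)
```

On this toy instance the greedy pass pairs post_a with alice and post_b with bob, for a total relevance of about 1.3, whereas pairing post_a with bob and post_b with alice would reach 1.5. The gap shows why such heuristics are stated with approximation guarantees rather than as exact solvers.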

Original language: English
Pages (from-to): 460-469
Number of pages: 10
Journal: Proceedings of the VLDB Endowment
Volume: 4
Issue number: 7
Publication status: Published - 1 Jan 2011
Externally published: Yes
