Ranking forests

Stéphan Clémençon, Marine Depecker, Nicolas Vayatis

Research output: Contribution to journal › Article › peer-review

Abstract

The present paper examines how the aggregation and feature randomization principles underlying the algorithm RANDOM FOREST (Breiman, 2001) can be adapted to bipartite ranking. The approach taken here is based on nonparametric scoring and ROC curve optimization in the sense of the AUC criterion. In this problem, aggregation is used to increase the performance of scoring rules produced by ranking trees, such as those developed in Clémençon and Vayatis (2009c). The present work describes the principles for building median scoring rules based on concepts from rank aggregation. Consistency results are derived for these aggregated scoring rules and an algorithm called RANKING FOREST is presented. Furthermore, various strategies for feature randomization are explored through a series of numerical experiments on artificial data sets.
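
As a rough illustration of the two notions the abstract relies on, the sketch below computes the empirical AUC of a scoring rule on binary-labeled data and aggregates several scoring rules by averaging ranks. It is not the paper's RANKING FOREST algorithm or its median procedure: mean-rank (Borda-style) aggregation is swapped in here as a simple stand-in, and the function names `empirical_auc` and `mean_rank_aggregate` are hypothetical.

```python
from itertools import product


def empirical_auc(scores, labels):
    """Share of (positive, negative) pairs that the scores order correctly.

    Ties between a positive and a negative score count as half a success.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    total = 0.0
    for s_pos, s_neg in product(pos, neg):
        if s_pos > s_neg:
            total += 1.0
        elif s_pos == s_neg:
            total += 0.5
    return total / (len(pos) * len(neg))


def mean_rank_aggregate(score_lists):
    """Average each item's mid-rank across several scoring rules.

    Assumption: a Borda-style stand-in for rank aggregation, not the
    median scoring rule studied in the paper.
    """
    n = len(score_lists[0])
    aggregated = [0.0] * n
    for scores in score_lists:
        for i in range(n):
            below = sum(1 for s in scores if s < scores[i])
            ties = sum(1 for s in scores if s == scores[i]) - 1
            aggregated[i] += below + 0.5 * ties
    return [a / len(score_lists) for a in aggregated]


# Toy data (made up for illustration): two positives, two negatives.
labels = [1, 1, 0, 0]
rule_a = [0.9, 0.4, 0.5, 0.1]
rule_b = [0.8, 0.7, 0.2, 0.3]
combined = mean_rank_aggregate([rule_a, rule_b])
```

Only the induced ordering matters for the AUC, which is why the aggregation works on ranks rather than on raw score values.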

Original language: English
Pages (from-to): 39-73
Number of pages: 35
Journal: Journal of Machine Learning Research
Volume: 14
Issue number: 1
Publication status: Published - 1 Jan 2013
Externally published: Yes

Keywords

  • AUC criterion
  • Bagging
  • Bipartite ranking
  • Bootstrap
  • Classification data
  • Feature randomization
  • Median ranking
  • Nonparametric scoring
  • ROC optimization
  • Rank aggregation
  • Tree-based ranking rules
