Overlaying Classifiers: A Practical Approach to Optimal Scoring

Stéphan Clémençon, Nicolas Vayatis

Research output: Contribution to journalArticlepeer-review

Abstract

The Receiver Operating Characteristic (ROC) curve is one of the most widely used visual tools to evaluate performance of scoring functions regarding their capacities to discriminate between two populations. It is the goal of this paper to propose a statistical learning method for constructing a scoring function with nearly optimal ROC curve. In this bipartite setup, the target is known to be the regression function up to an increasing transform, and solving the optimization problem boils down to recovering the collection of level sets of the latter, which we interpret here as a continuum of imbricated classification problems. We propose a discretization approach, consisting of building a finite sequence of N classifiers by constrained empirical risk minimization and then constructing a piecewise constant scoring function sN(x) by overlaying the resulting classifiers. Given the functional nature of the ROC criterion, the accuracy of the ranking induced by sN(x) can be conceived in a variety of ways, depending on the distance chosen for measuring closeness to the optimal curve in the ROC space. By relating the ROC curve of the resulting scoring function to piecewise linear approximates of the optimal ROC curve, we establish the consistency of the method as well as rate bounds to control its generalization ability in sup-norm. Eventually, we also highlight the fact that, as a byproduct, the algorithm proposed provides an accurate estimate of the optimal ROC curve.

Original languageEnglish
Pages (from-to)619-648
Number of pages30
JournalConstructive Approximation
Volume32
Issue number3
DOIs
Publication statusPublished - 1 Jan 2010

Keywords

  • AUC criterion
  • Bipartite ranking
  • Density level set
  • Minimum volume set estimation
  • Piecewise linear approximation
  • ROC curve
  • Scoring function
  • Statistical learning
  • Sup-norm

Fingerprint

Dive into the research topics of 'Overlaying Classifiers: A Practical Approach to Optimal Scoring'. Together they form a unique fingerprint.

Cite this