Nonparametric estimation of the precision-recall curve

Stéphan Clémençon, Nicolas Vayatis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The Precision-Recall (PR) curve is a widely used visual tool for evaluating how well scoring functions discriminate between two populations. The purpose of this paper is to examine both theoretical and practical issues related to the statistical estimation of PR curves based on classification data. Consistency and asymptotic normality of the empirical counterpart of the PR curve in sup norm are rigorously established. Finally, the issue of building confidence bands in the PR space is considered, and a specific resampling procedure based on a smoothed and truncated version of the empirical distribution of the data is advocated. Theoretical and computational arguments are presented to explain why such a bootstrap is preferable to a "naive" bootstrap in this setup.
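As a rough illustration of the objects named in the abstract, the sketch below computes an empirical PR curve from scored, labeled data and draws one generic smoothed bootstrap replicate. It is not the authors' procedure: the function names, the within-class resampling, the Gaussian kernel, and the `bandwidth` parameter are all assumptions made for illustration, and the truncation step mentioned in the abstract is not reproduced.

```python
import random


def empirical_pr_curve(scores, labels):
    """Return (recall, precision) points of the empirical PR curve.

    One point is produced per distinct score value t, treating every
    observation with score >= t as predicted positive. Labels are 0 or 1.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Sweep thresholds from the highest score down to the lowest.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = []
    true_pos = 0
    n_pred = 0
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        # Absorb every observation tied at the current threshold.
        while i < len(pairs) and pairs[i][0] == t:
            true_pos += pairs[i][1]
            n_pred += 1
            i += 1
        points.append((true_pos / n_pos, true_pos / n_pred))
    return points


def smoothed_bootstrap_sample(scores, labels, bandwidth, rng):
    """Draw one smoothed bootstrap replicate (hypothetical sketch).

    Scores are resampled with replacement within each class, then
    perturbed with Gaussian noise of standard deviation `bandwidth`.
    A naive bootstrap corresponds to bandwidth = 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    new_scores = [rng.choice(pos) + rng.gauss(0.0, bandwidth) for _ in pos]
    new_scores += [rng.choice(neg) + rng.gauss(0.0, bandwidth) for _ in neg]
    new_labels = [1] * len(pos) + [0] * len(neg)
    return new_scores, new_labels


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [1, 0, 1, 0]
    print(empirical_pr_curve(scores, labels))
    rng = random.Random(0)
    boot_scores, boot_labels = smoothed_bootstrap_sample(scores, labels, 0.05, rng)
    print(empirical_pr_curve(boot_scores, boot_labels))
```

Repeating the replicate step many times and recomputing the curve each time would give the raw material for a confidence band. How the band is actually constructed, and how the smoothing level is chosen, are matters addressed in the paper itself.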

Original language: English
Title of host publication: Proceedings of the 26th Annual International Conference on Machine Learning, ICML'09
Publication status: Published - 15 Sept 2009
Externally published: Yes
Event: 26th Annual International Conference on Machine Learning, ICML'09 - Montreal, QC, Canada
Duration: 14 Jun 2009 to 18 Jun 2009

Publication series

Name: ACM International Conference Proceeding Series
Volume: 382

Conference

Conference: 26th Annual International Conference on Machine Learning, ICML'09
Country/Territory: Canada
City: Montreal, QC
Period: 14/06/09 to 18/06/09
