AUC optimization and the two-sample problem

Stephan Clémençon, Marine Depecker, Nicolas Vayatis

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

The purpose of the paper is to explore the connection between multivariate homogeneity tests and AUC optimization. The latter problem has recently received much attention in the statistical learning literature. From the elementary observation that, in the two-sample problem setup, the null assumption corresponds to the situation where the area under the optimal ROC curve is equal to 1/2, we propose a two-stage testing method based on data splitting. A nearly optimal scoring function in the AUC sense is first learnt from one of the two half-samples. Data from the remaining half-sample are then projected onto the real line and eventually ranked according to the scoring function computed at the first stage. The last step amounts to performing a standard Mann-Whitney Wilcoxon test in the one-dimensional framework. We show that the learning step of the procedure does not affect the consistency of the test as well as its properties in terms of power, provided the ranking produced is accurate enough in the AUC sense. The results of a numerical experiment are eventually displayed in order to show the efficiency of the method.

Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference
PublisherNeural Information Processing Systems
Pages360-368
Number of pages9
ISBN (Print)9781615679119
Publication statusPublished - 1 Jan 2009
Externally publishedYes
Event23rd Annual Conference on Neural Information Processing Systems, NIPS 2009 - Vancouver, BC, Canada
Duration: 7 Dec 200910 Dec 2009

Publication series

NameAdvances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference

Conference

Conference23rd Annual Conference on Neural Information Processing Systems, NIPS 2009
Country/TerritoryCanada
CityVancouver, BC
Period7/12/0910/12/09

Fingerprint

Dive into the research topics of 'AUC optimization and the two-sample problem'. Together they form a unique fingerprint.

Cite this