Confident Interpretations of Black Box Classifiers

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep Learning models provide state-of-the-art classification results, but are not human-interpretable. We propose a novel method to interpret the classification results of a black box model a posteriori. We emulate the complex classifier by surrogate decision trees. Each tree mimics the behavior of the complex classifier by overestimating one of the classes. This yields a global, interpretable approximation of the black box classifier. Our method provides interpretations that are at the same time general (applying to many data points), confident (generalizing well to other data points), faithful to the original model (making the same predictions), and simple (easy to understand). Our experiments show that our method outperforms competing methods on these desiderata, and our user study shows that users prefer this type of interpretation over others.
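The idea of emulating a black box with one surrogate decision tree per class can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the black-box model, the one-vs-rest training scheme, and the depth limit are all assumptions made for the example.

```python
# Sketch: approximate a black-box classifier with one shallow surrogate
# decision tree per class, then measure fidelity to the black box.
# Illustrative only -- not the method proposed in the paper.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The "black box": any opaque model whose predictions we want to explain.
black_box = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                          random_state=0).fit(X, y)
bb_pred = black_box.predict(X)  # labels assigned by the black box

# One shallow tree per class, trained to mimic the black box's
# one-vs-rest decision; small depth keeps each tree interpretable.
surrogates = {}
for c in np.unique(bb_pred):
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X, (bb_pred == c).astype(int))
    surrogates[c] = tree

# Fidelity: how often the surrogate ensemble reproduces the black box's
# own predictions (argmax over the per-class tree probabilities).
classes = sorted(surrogates)
scores = np.column_stack([surrogates[c].predict_proba(X)[:, 1]
                          for c in classes])
ensemble_pred = np.array(classes)[scores.argmax(axis=1)]
fidelity = (ensemble_pred == bb_pred).mean()
print(f"fidelity to black box: {fidelity:.2f}")
```

Because each surrogate tree is shallow, its root-to-leaf paths read as simple rules for one class, while fidelity quantifies how faithfully the ensemble tracks the original model.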

Original language: English
Title of host publication: IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9780738133669
DOIs
Publication status: Published - 18 Jul 2021
Event: 2021 International Joint Conference on Neural Networks, IJCNN 2021 - Virtual, Online, China
Duration: 18 Jul 2021 – 22 Jul 2021

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2021-July
ISSN (Print): 2161-4393
ISSN (Electronic): 2161-4407

Conference

Conference: 2021 International Joint Conference on Neural Networks, IJCNN 2021
Country/Territory: China
City: Virtual, Online
Period: 18/07/21 – 22/07/21
