OBTAINING ACCURATE PROBABILITIES USING CLASSIFIER CALIBRATION

Pakdaman Naeini, Mahdi (2017) OBTAINING ACCURATE PROBABILITIES USING CLASSIFIER CALIBRATION. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

Learning probabilistic classification and prediction models that generate accurate probabilities is essential in many prediction and decision-making tasks in machine learning and data mining. One way to achieve this goal is to post-process the output of classification models to obtain more accurate probabilities. These post-processing methods are often referred to as calibration methods in the machine learning literature.

This thesis describes a suite of parametric and non-parametric methods for calibrating the output of classification and prediction models. In order to evaluate the calibration performance of a classifier, we introduce two new calibration measures that are intuitive statistics of the calibration curves. We present extensive experimental results on both simulated and real datasets to evaluate the performance of the proposed methods compared with commonly used calibration methods in the literature. In particular, in terms of binary classifier calibration, our experimental results show that the proposed methods are able to improve the calibration power of classifiers while retaining their discrimination performance. Our theoretical findings show that by using a simple non-parametric calibration method, it is possible to improve the calibration performance of a classifier without sacrificing discrimination capability. The methods are also computationally tractable for large-scale datasets as they run in O(N log N) time, where N is the number of samples.
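
As an illustration of what post-processing calibration and a calibration-curve statistic can look like, the following is a minimal sketch, not the methods proposed in the thesis. It assumes a generic equal-frequency histogram-binning calibrator and a binned expected/maximum calibration gap; the function names (`fit_histogram_binning`, `calibrate`, `calibration_errors`) and the toy data are hypothetical. Sorting dominates the fitting cost, which is what makes an O(N log N) bound plausible for methods of this kind.

```python
from bisect import bisect_left


def fit_histogram_binning(scores, labels, n_bins=10):
    """Fit an equal-frequency binning calibrator (generic baseline, assumed here).

    Returns the upper score boundary of every bin except the last, and the
    observed positive rate inside each bin.
    """
    pairs = sorted(zip(scores, labels))  # O(N log N) step
    n = len(pairs)
    uppers, rates = [], []
    for b in range(n_bins):
        lo, hi = b * n // n_bins, (b + 1) * n // n_bins
        if lo == hi:  # skip empty bins when N < n_bins
            continue
        chunk = pairs[lo:hi]
        uppers.append(chunk[-1][0])
        rates.append(sum(y for _, y in chunk) / len(chunk))
    return uppers[:-1], rates


def calibrate(score, edges, rates):
    """Map a raw score to the positive rate of the bin it falls into."""
    return rates[bisect_left(edges, score)]


def calibration_errors(probs, labels, n_bins=10):
    """Return (expected, maximum) gap between mean predicted probability and
    observed positive rate over equal-width bins of the calibration curve."""
    sum_p = [0.0] * n_bins
    sum_y = [0.0] * n_bins
    count = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)
        sum_p[b] += p
        sum_y[b] += y
        count[b] += 1
    gaps = [(abs(sum_y[b] - sum_p[b]) / count[b], count[b])
            for b in range(n_bins) if count[b]]
    expected = sum(gap * c for gap, c in gaps) / len(probs)
    maximum = max(gap for gap, _ in gaps)
    return expected, maximum


# Toy data (illustrative only): scores separate the classes well but are
# poorly calibrated as probabilities.
scores = [i / 100 for i in range(100)]
labels = [1 if s > 0.5 else 0 for s in scores]

edges, rates = fit_histogram_binning(scores, labels)
calibrated = [calibrate(s, edges, rates) for s in scores]

before = calibration_errors(scores, labels)
after = calibration_errors(calibrated, labels)
```

In this sketch the calibrator is fitted and evaluated on the same toy data for brevity; in practice a held-out set would be used. Plain histogram binning also does not guarantee that bin rates are monotone in the score, so it may change the ranking of examples.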

In this thesis we also introduce a novel framework to derive calibrated probabilities of causal relationships from observational data. The framework consists of three main components: (1) an approximate method for generating initial probability estimates of the edge types for each pair of variables, (2) the availability of a relatively small number of the causal relationships in the network for which the truth status is known, which we call a calibration training set, and (3) a calibration method for using the approximate probability estimates and the calibration training set to generate calibrated probabilities for the many remaining pairs of variables. Our experiments on a range of simulated data support that the proposed approach improves the calibration of edge predictions. The results also support that the approach often improves the precision and recall of those predictions.
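
The three components above can be sketched as a small pipeline. This is a hedged illustration under several assumptions: the edge types are reduced to a binary "edge present" label, the initial estimates in `initial_probs` and the truth labels in `known` are made-up toy values, and the calibrator is a plain isotonic regression fitted with pool-adjacent-violators rather than the calibration methods developed in the thesis. All names are hypothetical.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit of a non-decreasing step function.

    Returns the upper score boundary and the fitted value of each step.
    """
    blocks = []  # each block: [sum of labels, count, largest score]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous step is higher than the current one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            sum_y, cnt, top = blocks.pop()
            blocks[-1][0] += sum_y
            blocks[-1][1] += cnt
            blocks[-1][2] = top
    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values


def apply_isotonic(score, uppers, values):
    """Look up the fitted step for a score; scores above the range use the last step."""
    i = bisect_left(uppers, score)
    return values[min(i, len(values) - 1)]


# (1) Initial, approximate edge probabilities for every variable pair (toy values).
initial_probs = {
    ("A", "B"): 0.10, ("A", "C"): 0.20, ("B", "C"): 0.30,
    ("B", "D"): 0.60, ("C", "D"): 0.70, ("A", "D"): 0.90,
    ("C", "E"): 0.50, ("D", "E"): 0.95, ("A", "E"): 0.05,
}

# (2) Calibration training set: a small subset of pairs with known truth status.
known = {
    ("A", "B"): 0, ("A", "C"): 0, ("B", "C"): 1,
    ("B", "D"): 0, ("C", "D"): 1, ("A", "D"): 1,
}

# (3) Fit the calibrator on the known pairs, then apply it to the remaining pairs.
train_scores = [initial_probs[pair] for pair in known]
train_labels = [known[pair] for pair in known]
uppers, values = fit_isotonic(train_scores, train_labels)

calibrated = {
    pair: apply_isotonic(p, uppers, values)
    for pair, p in initial_probs.items()
    if pair not in known
}
```

Because the fitted map is non-decreasing, it never reverses the order of two edge scores, although it can introduce ties; this is one reason monotone calibrators are a common choice when ranking quality should be preserved.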

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Pakdaman Naeini, Mahdi	map218@pitt.edu	map218

ETD Committee:

Title	Member	Email Address
Committee Chair	Cooper, Gregory F	gfc@pitt.edu
Committee Member	Hauskrecht, Milos	milos@cs.pitt.edu
Committee Member	Visweswaran, Shyam	shv3@pitt.edu
Committee Member	Schneider, Jeff	schneide@cs.cmu.edu

Date:

27 January 2017

Date Type:

Publication

Defense Date:

5 August 2016

Approval Date:

27 January 2017

Submission Date:

28 November 2016

Access Restriction:

No restriction; Release the ETD for access worldwide immediately.

Number of Pages:

150

Institution:

University of Pittsburgh

Schools and Programs:

Dietrich School of Arts and Sciences > Intelligent Systems

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

Classifier calibration, causality detection, Bayesian Binning into Quantiles (BBQ), Ensemble of Linear Trend Estimation (ELiTE), Ensemble of Near Isotonic Regression (ENIR)

Date Deposited:

27 Jan 2017 17:00

Last Modified:

28 Jan 2017 06:15

URI:

http://d-scholarship.pitt.edu/id/eprint/30526

Available Versions of this Item

OBTAINING ACCURATE PROBABILITIES USING CLASSIFIER CALIBRATION. (deposited 27 Jan 2017 17:00) [Currently Displayed]
