NON-PARAMETRIC GRAPH-BASED METHODS FOR LARGE SCALE PROBLEMS

Amizadeh, Saeed (2014) NON-PARAMETRIC GRAPH-BASED METHODS FOR LARGE SCALE PROBLEMS. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Preview

PDF
Accepted Version
Download (3MB) | Preview

Abstract

The notion of similarity between observations plays a very fundamental role in many Machine Learning and Data Mining algorithms. In
many of these methods, the fundamental problem of prediction, which is making assessments and/or inferences about the future observations from the
past ones, boils down to how ``similar'' the future cases are to the already observed ones. However, similarity is not always
obtained through the traditional distance metrics. Data-driven similarity metrics, in particular, come into play where the traditional absolute
metrics are not sufficient for the task in hand due to special structure of the observed data. A common approach for computing data-driven similarity
is to somehow
aggregate the local absolute similarities (which are not data-driven and can be computed in a closed-from) to infer a global
data-driven similarity value between any pair of observations. The graph-based methods offer a natural framework to do so.
Incorporating these methods, many of the Machine Learning algorithms, that are
designed to work with absolute distances, can be applied on those problems with data-driven distances. This makes graph-based methods
very effective tools for many real-world problems.

In this thesis, the major problem that I want to address is the scalability of the graph-based methods. With the rise of large-scale,
high-dimensional datasets in many real-world applications, many Machine Learning algorithms do not scale up well applying to these problems.
The graph-based methods are no exception
either. Both the large number of observations and the high dimensionality hurt graph-based methods,
computationally and statistically. While the large number of observations imposes more of a computational problem, the high dimensionality
problem has more of a statistical nature. In this thesis, I address both of these issues in depth and review the common solutions for them proposed
in the literature. Moreover, for each of these problems, I propose novel solutions with experimental results depicting the merits of the proposed
algorithms. Finally, I discuss the contribution of the proposed work from a broader viewpoint and draw some future directions of the current work.

Citation/Export:
Social Networking:	Share \|

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Amizadeh, Saeed	saeed.amizadeh@gmail.com

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Hauskrecht, Milos	milos@cs.pitt.edu	MILOS
Committee Member	Cooper, Gregory F	gfc@cbmi.pitt.edu	GFC
Committee Member	Druzdzel, Marek J.	marek@sis.pitt.edu	DRUZDZEL
Committee Member	Chennubhotla, Chakra	chakracs@pitt.edu	CHAKRACS
Committee Member	Visweswaran, Shyam	shv3@pitt.edu	SHV3
Committee Member	Nugent, Rebecca	rnugent@stat.cmu.edu

Date:

28 January 2014

Date Type:

Publication

Defense Date:

28 August 2013

Approval Date:

28 January 2014

Submission Date:

29 October 2013

Access Restriction:

No restriction; Release the ETD for access worldwide immediately.

Number of Pages:

186

Institution:

University of Pittsburgh

Schools and Programs:

Dietrich School of Arts and Sciences > Intelligent Systems

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

Machine Learning, Data-Driven Similarity, Graph-based Methods, Spectral Graph-based Methods, Non-parametric Methods, Large-Scale Problems, N-body Problems, Curse of Dimensionality, Label Propagation, Semi-supervised Learning, Spectral Clustering

Date Deposited:

28 Jan 2014 16:25

Last Modified:

15 Nov 2016 14:15

URI:

http://d-scholarship.pitt.edu/id/eprint/19949

Metrics

Monthly Views for the past 3 years

Plum Analytics

Actions (login required)

View Item

My Account

Search

Browse

Information

NON-PARAMETRIC GRAPH-BASED METHODS FOR LARGE SCALE PROBLEMS

Abstract

Share

Details

Metrics

Monthly Views for the past 3 years

Plum Analytics

Actions (login required)

Connect with us

Send Comments or Questions

Feeds