Weighted machine learning for spatial-temporal data

Hashemi, Mahdi (2018) Weighted machine learning for spatial-temporal data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

This is the latest version of this item.

Restricted to University of Pittsburgh users only until 24 January 2023.


Abstract

In supervised machine learning, training samples are not always equally important: they can differ in accuracy, reliability, source, relevance, or other respects. Non-weighted machine learning techniques are designed for equally important training samples: (a) the cost of misclassification is equal for all training samples in parametric classification techniques, (b) residuals are equally important in parametric regression models, and (c) when voting in non-parametric classification and regression models, training samples either have equal weights or have weights determined internally by kernels in the feature space, so no external weights are used. In this thesis, we develop weighted versions of the Bayesian predictor, perceptron, multilayer perceptron, SVM, and decision tree, and show how their results differ from those of their non-weighted versions.
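As an illustration of point (a), the following is a minimal sketch of one common way to attach external weights to a perceptron: each sample's weight scales the size of the update made when that sample is misclassified. This is an assumption for illustration and not necessarily the formulation developed in the thesis; the function names, the learning rate, and the choice to skip zero-weight samples are all hypothetical.

```python
from typing import List, Sequence, Tuple


def train_weighted_perceptron(
    X: Sequence[Sequence[float]],
    y: Sequence[int],                 # labels in {-1, +1}
    sample_weight: Sequence[float],   # external, non-negative weights (assumed)
    epochs: int = 100,
    lr: float = 1.0,
) -> Tuple[List[float], float]:
    """Perceptron whose update step is scaled by each sample's weight."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi, si in zip(X, y, sample_weight):
            if si == 0.0:
                continue  # a zero-weight sample has no influence on training
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:
                # Misclassified: move the boundary, scaled by the sample weight.
                step = lr * si * yi
                w = [wj + step * xj for wj, xj in zip(w, xi)]
                b += step
                mistakes += 1
        if mistakes == 0:
            break  # every weighted sample is classified correctly
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the side of the learned boundary."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1
```

With all weights equal to 1 this reduces to the ordinary perceptron; lowering the weight of an unreliable sample reduces how far a mistake on it can move the decision boundary.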
Applying machine learning techniques to spatial-temporal data raises the question of how the recorded location and time of training samples should contribute to the training and testing process. Prior knowledge of how spatial-temporal phenomena are autocorrelated cannot be properly captured by machine learning techniques that either ignore location and time altogether or treat them as input features. The latter approach also increases the sparseness of data in the feature space and adds free parameters to the predictor, thus demanding larger training datasets. We use prior knowledge about spatial-temporal autocorrelation to determine how relevant each training sample is, given its spatial and temporal distances to the irresponsive (unlabeled) sample. Weighted machine learning techniques use this prior knowledge by taking the relevance of each training sample to the irresponsive sample into account as that training sample’s weight. The proposed approach overcomes the aforementioned issues by enriching the training process with prior knowledge about spatial-temporal autocorrelation. Because the spatial-temporal weight of a training sample depends on the irresponsive sample’s location and time, the machine needs to be trained separately for each irresponsive sample. However, we show that, in practice, using only a small subset of training samples with the largest spatial-temporal weights not only reduces the training time but also yields the best accuracy in most cases.
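The per-query procedure described above can be sketched roughly as follows: compute a spatial-temporal weight for every training sample relative to the unlabeled sample, keep only the k samples with the largest weights, and fit a weighted model on that subset. The abstract does not specify the weighting function or the model, so the exponential-decay kernel, the bandwidths `h_space` and `h_time`, the planar coordinates, and the single-feature weighted least-squares line used here are all assumptions chosen for brevity.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Sample:
    feature: float  # single input feature (hypothetical)
    target: float   # observed response
    east: float     # planar coordinates (assumed units, e.g. metres)
    north: float
    time: float     # timestamp (assumed units, e.g. hours)


def st_weight(s: Sample, east: float, north: float, time: float,
              h_space: float, h_time: float) -> float:
    """Assumed kernel: weight decays exponentially with spatial and temporal distance."""
    d_space = math.hypot(s.east - east, s.north - north)
    d_time = abs(s.time - time)
    return math.exp(-d_space / h_space) * math.exp(-d_time / h_time)


def top_k(samples: Sequence[Sample], east: float, north: float, time: float,
          k: int, h_space: float, h_time: float) -> List[Tuple[float, Sample]]:
    """Return the k (weight, sample) pairs most relevant to the query point."""
    weighted = [(st_weight(s, east, north, time, h_space, h_time), s)
                for s in samples]
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return weighted[:k]


def fit_weighted_line(weighted: Sequence[Tuple[float, Sample]]) -> Tuple[float, float]:
    """Weighted least-squares fit of target = slope * feature + intercept."""
    total = sum(w for w, _ in weighted)
    x_mean = sum(w * s.feature for w, s in weighted) / total
    y_mean = sum(w * s.target for w, s in weighted) / total
    sxx = sum(w * (s.feature - x_mean) ** 2 for w, s in weighted)
    if sxx == 0.0:
        return 0.0, y_mean  # degenerate case: fall back to the weighted mean
    sxy = sum(w * (s.feature - x_mean) * (s.target - y_mean)
              for w, s in weighted)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def predict_for_query(samples: Sequence[Sample], feature: float,
                      east: float, north: float, time: float,
                      k: int = 3, h_space: float = 1.0,
                      h_time: float = 1.0) -> float:
    """Train a fresh weighted model for this one unlabeled sample, then predict."""
    subset = top_k(samples, east, north, time, k, h_space, h_time)
    slope, intercept = fit_weighted_line(subset)
    return slope * feature + intercept
```

Because the model is refit for every unlabeled sample, restricting training to the top-k subset keeps the cost of each fit proportional to k rather than to the size of the full training set.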


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
  • Hashemi, Mahdi (Email: mmh75@pitt.edu; Pitt Username: mmh75)
ETD Committee:
  • Thesis Advisor: Karimi, Hassan A. (Email: hkarimi@pitt.edu; Pitt Username: hkarimi)
  • Committee Member: Krishnamurthy, Prashant (Email: prashant@sis.pitt.edu; Pitt Username: prashant)
  • Committee Member: Li, Ching-Chung (Email: ccl@pitt.edu; Pitt Username: ccl)
  • Committee Member: Munro, Paul (Email: pmunro@sis.pitt.edu; Pitt Username: pmunro)
Date: 24 January 2018
Date Type: Publication
Defense Date: 25 September 2017
Approval Date: 24 January 2018
Submission Date: 25 November 2017
Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.
Number of Pages: 131
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Information Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Machine learning, Spatial-temporal data, Analytical learning
Date Deposited: 24 Jan 2018 16:27
Last Modified: 24 Jan 2018 16:27
URI: http://d-scholarship.pitt.edu/id/eprint/33695

Available Versions of this Item

  • Weighted machine learning for spatial-temporal data. (deposited 24 Jan 2018 16:27) [Currently Displayed]
