Learning of classification models from group-based feedback

Luo, Zhipeng (2020) Learning of classification models from group-based feedback. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

Learning classification models in practice often relies on a nontrivial amount of human annotation effort. The most widely adopted human labeling process assigns class labels to individual data instances. However, this process is rigid and can be time-consuming and costly in practice. Finding more effective ways to reduce human annotation effort has become critical for building machine learning systems that require human feedback.

In this thesis, we propose and investigate a new machine learning approach, Group-Based Active Learning, to learn classification models from limited human feedback. A group is defined by a set of instances represented by conjunctive patterns that are value ranges over the input features. Such conjunctive patterns define hypercubic regions of the input data space. A human annotator assesses the group solely based on its region-based description by providing an estimate of the class proportion for the subpopulation covered by the region. The advantage of this labeling process is that it allows a human to label many instances at the same time, which can, in turn, improve the labeling efficiency.
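As an illustration only, a region described by conjunctive value ranges and its class-proportion label can be sketched as follows. This is a minimal sketch, not the thesis's implementation: the names `Region`, `covers`, and `class_proportion`, the half-open interval convention, and the toy features `age` and `bmi` are all assumptions made for the example. In the setting described above a human annotator would supply the proportion; here it is simulated from known binary labels.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: feature name -> (low, high) value range.
# The half-open convention low <= value < high is an assumption of this sketch.
Region = Dict[str, Tuple[float, float]]


def covers(region: Region, x: Dict[str, float]) -> bool:
    """Return True if instance x satisfies every range condition (a conjunction)."""
    return all(lo <= x[f] < hi for f, (lo, hi) in region.items())


def class_proportion(region: Region, X: List[Dict[str, float]], y: List[int]) -> float:
    """Simulate the annotator: fraction of positive labels among covered instances."""
    members = [label for x, label in zip(X, y) if covers(region, x)]
    if not members:
        raise ValueError("region covers no instances")
    return sum(members) / len(members)


# Toy data (invented for illustration).
X = [
    {"age": 30, "bmi": 22},
    {"age": 55, "bmi": 31},
    {"age": 62, "bmi": 28},
    {"age": 70, "bmi": 35},
]
y = [0, 1, 0, 1]

# "50 <= age < 80 AND 25 <= bmi < 40" defines a hyperrectangle in feature space.
region = {"age": (50, 80), "bmi": (25, 40)}
proportion = class_proportion(region, X, y)
```

A single proportion answer here stands in for labeling all three covered instances at once, which is the efficiency argument made in the paragraph above.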

In general, there are infinitely many regions one can define over a real-valued input space. To identify and label groups/regions important for classification learning, we propose and develop a Hierarchical Active Learning framework that actively builds and labels a hierarchy of input regions. Briefly, our framework starts by identifying general regions covering substantial portions of the input data space. After that, it progressively splits the regions into smaller and smaller sub-regions and also acquires class proportion labels for the new regions. The proportion labels for these regions are used to gradually improve and refine a classification model induced by the regions. We develop three versions of the idea. The first two versions aim to build a single hierarchy of regions. One builds it statically using hierarchical clustering, while the other builds it dynamically, similarly to the decision tree learning process. The third approach builds multiple hierarchies simultaneously, and it offers additional flexibility for identifying more informative and simpler regions. We have conducted comprehensive empirical studies to evaluate our framework. The results show that methods based on region-based active learning can learn very good classifiers from very few, simple region queries, and hence are promising for reducing the human annotation effort needed for building a variety of classification models.
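The general idea of progressively splitting regions and acquiring proportion labels for the new sub-regions can be sketched roughly as below. The abstract does not state how regions are chosen or split, so everything specific here is an assumption for illustration: the helper names `split_region` and `build_hierarchy`, the rule of splitting the most mixed leaf (proportion closest to 0.5), the midpoint split on the widest feature, and the query `budget`. The `oracle` callable stands in for the human annotator.

```python
from typing import Callable, Dict, List, Tuple

Region = Dict[str, Tuple[float, float]]  # feature name -> (low, high)


def split_region(region: Region, feature: str, threshold: float) -> Tuple[Region, Region]:
    """Split a region into two sub-regions along one feature."""
    lo, hi = region[feature]
    if not lo < threshold < hi:
        raise ValueError("threshold must lie strictly inside the range")
    left, right = dict(region), dict(region)
    left[feature] = (lo, threshold)
    right[feature] = (threshold, hi)
    return left, right


def build_hierarchy(
    root: Region, oracle: Callable[[Region], float], budget: int
) -> List[Tuple[Region, float]]:
    """Refine leaves top-down until the query budget is used up.

    Assumed heuristic (not from the source): split the leaf whose proportion
    label is closest to 0.5, at the midpoint of its widest feature.
    """
    leaves = [(root, oracle(root))]
    queries = 1
    while queries + 2 <= budget:
        idx = min(range(len(leaves)), key=lambda i: abs(leaves[i][1] - 0.5))
        region, _ = leaves.pop(idx)
        feature = max(region, key=lambda f: region[f][1] - region[f][0])
        lo, hi = region[feature]
        for child in split_region(region, feature, (lo + hi) / 2):
            leaves.append((child, oracle(child)))  # one proportion query per child
        queries += 2
    return leaves


def toy_oracle(region: Region) -> float:
    """Toy annotator: class 1 iff x > 0.5, with x uniform on the region."""
    lo, hi = region["x"]
    return max(0.0, hi - max(lo, 0.5)) / (hi - lo)


leaves = build_hierarchy({"x": (0.0, 1.0)}, toy_oracle, budget=3)
```

With a budget of three queries, the root is labeled once and split once, giving two leaves whose proportion labels are pure in this toy setting. A classifier could then predict, for any instance, the proportion of the leaf that covers it.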


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
  Creators | Email | Pitt Username | ORCID
  Luo, Zhipeng | ZHL78@pitt.edu | ZHL78 |
ETD Committee:
  Title | Member | Email Address | Pitt Username | ORCID
  Committee Chair | Hauskrecht, Milos | milos@pitt.edu | milos |
  Committee Member | Kovashka, Adriana | kovashka@cs.pitt.edu | kovashka |
  Committee Member | Hwa, Rebecca | reh23@pitt.edu | reh23 |
  Committee Member | Ren, Zhao | zren@pitt.edu | zren |
Date: 20 August 2020
Date Type: Publication
Defense Date: 23 June 2020
Approval Date: 20 August 2020
Submission Date: 7 August 2020
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 139
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Computer Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Active Learning, Learning from Label Proportions, Weakly Supervised Learning, Decision Tree Learning
Date Deposited: 20 Aug 2020 18:42
Last Modified: 20 Aug 2020 18:42
URI: http://d-scholarship.pitt.edu/id/eprint/39580
