
Algorithms, applications and systems towards interpretable pattern mining from multi-aspect data

Wen, Xidao (2020) Algorithms, applications and systems towards interpretable pattern mining from multi-aspect data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


Abstract

How do people move around urban space, and how does their movement change when the city undergoes a terrorist attack? How do users behave in Massive Open Online Courses (MOOCs), and how does the behavior of those who earn certificates differ from that of those who do not? From which areas of the court do elite players, such as Stephen Curry and LeBron James, like to take their shots over the course of a game? How can we uncover the hidden habits that govern our online purchases? Are there unspoken agendas in how different states pass legislation of certain kinds? At the heart of these seemingly unconnected puzzles is the same mystery of multi-aspect mining, i.e., how can we mine and interpret hidden patterns in a dataset that simultaneously reveal the associations, or changes in the associations, among various aspects of the data (e.g., a shot can be described by three aspects: player, time of the game, and area of the court)? Solving this problem could open the gates to a deep understanding of the underlying mechanisms of many real-world phenomena. While much of the research in multi-aspect mining contributes a broad scope of innovations in the mining part, the interpretation of patterns from the perspective of users (or domain experts) is often overlooked. Questions such as what users require of patterns, how good the patterns are, or how to read them have barely been addressed. Without efficient and effective ways of involving users in the process of multi-aspect mining, the results are likely to be difficult for them to comprehend.

This dissertation proposes the M^3 framework, which consists of multiplex pattern discovery, multifaceted pattern evaluation, and multipurpose pattern presentation, to tackle the challenges of multi-aspect pattern discovery. Based on this framework, we develop algorithms, applications, and analytic systems to enable interpretable pattern discovery from multi-aspect data. Following the concept of meaningful multiplex pattern discovery, we propose PairFac to close the gap between human information needs and naive mining optimization. We demonstrate its effectiveness in the context of impact discovery in the aftermath of urban disasters. We develop iDisc to target the crossing of multiplex pattern discovery with multifaceted pattern evaluation. iDisc meets the specific information need of understanding multi-level, contrastive behavior patterns. As an example, we use iDisc to predict student performance outcomes in Massive Open Online Courses given users' latent behaviors. FacIt is an interactive visual analytic system that sits at the intersection of all three components and enables interpretable, fine-tunable, and scrutinizable pattern discovery from multi-aspect data. We demonstrate each work's significance and implications in its respective problem context. As a whole, this series of studies is an effort to instantiate the M^3 framework and push the field of multi-aspect mining towards a more human-centric process in real-world applications.

Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Wen, Xidao (Email: xiw55@pitt.edu; Pitt Username: xiw55)
ETD Committee:
Committee Chair: Lin, Yu-Ru (Email: yuruliny@gmail.com)
Committee Member: Brusilovsky, Peter (Email: pbrusilovsky@gmail.com)
Committee Member: Pelechrinis, Konstantinos (Email: kostas.pelechrinis@gmail.com)
Committee Member: Faloutsos, Christos (Email: christos@cs.cmu.edu)
Date: 23 January 2020
Date Type: Publication
Defense Date: 14 October 2019
Approval Date: 23 January 2020
Submission Date: 28 November 2019
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 198
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Information Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Machine Learning, Data Mining, Interpretability
Date Deposited: 23 Jan 2020 19:02
Last Modified: 23 Jan 2020 19:02
URI: http://d-scholarship.pitt.edu/id/eprint/37907
