
RANK-BASED TEMPO-SPATIAL CLUSTERING: A FRAMEWORK FOR RAPID OUTBREAK DETECTION USING SINGLE OR MULTIPLE DATA STREAMS

Que, Jialan (2012) RANK-BASED TEMPO-SPATIAL CLUSTERING: A FRAMEWORK FOR RAPID OUTBREAK DETECTION USING SINGLE OR MULTIPLE DATA STREAMS. Doctoral Dissertation, University of Pittsburgh.

Abstract

In recent decades, algorithms for disease outbreak detection have become a main interest of public health practitioners, who aim to identify and localize an outbreak as early as possible so that a public health response can begin before a pandemic develops. Today’s increased threat of biological warfare and terrorism provides an even stronger impetus to develop methods for outbreak detection based on symptoms as well as definitive laboratory diagnoses.

In this dissertation, I explore the problem of rapid disease outbreak detection using both spatial and temporal information. I develop a framework of non-parameterized algorithms that search for patterns of disease outbreak in spatial sub-regions of the monitored region within a certain period. Compared with existing spatial or tempo-spatial algorithms, the algorithms in this framework provide a methodology for fast searching of either univariate or multivariate data sets. The framework first measures how likely each study area is to have an outbreak, given the baseline data and the currently observed data. It then applies a greedy search, using the risk measurement of each unit area as a heuristic, to look for clusters with high posterior probabilities. I also evaluate the performance of the proposed algorithms.
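
The two-step idea described above (rank unit areas by risk, then grow a cluster greedily) can be illustrated with a minimal sketch. This is not the dissertation's RSC implementation. The risk measure (observed-to-baseline ratio), the scoring function (an expectation-based Poisson log-likelihood ratio used as a stand-in for a posterior probability), and all names are assumptions made for illustration. Spatial adjacency and the temporal window are ignored here.

```python
import math


def poisson_llr(count, baseline):
    """Stand-in cluster score (assumption): expectation-based Poisson
    log-likelihood ratio. Returns 0 when the count does not exceed baseline."""
    if baseline <= 0 or count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count


def greedy_rank_search(counts, baselines):
    """Rank areas by observed/baseline ratio, then add them one at a time
    in rank order and keep the prefix with the highest score.

    counts, baselines: dicts keyed by area id; baselines must be positive.
    """
    # Step 1: per-area risk measure used as the search heuristic.
    ranked = sorted(counts, key=lambda a: counts[a] / baselines[a], reverse=True)

    # Step 2: greedy growth, evaluating only len(ranked) candidate clusters.
    best_cluster, best_score = [], 0.0
    total_count = total_baseline = 0.0
    for i, area in enumerate(ranked):
        total_count += counts[area]
        total_baseline += baselines[area]
        score = poisson_llr(total_count, total_baseline)
        if score > best_score:
            best_cluster, best_score = ranked[: i + 1], score
    return best_cluster, best_score


# Hypothetical data: four areas, each with a baseline of 10 cases.
counts = {"A": 30, "B": 12, "C": 9, "D": 28}
baselines = {"A": 10.0, "B": 10.0, "C": 10.0, "D": 10.0}
cluster, score = greedy_rank_search(counts, baselines)
```

Because only one candidate cluster is scored per rank position, the number of evaluations grows linearly with the number of areas. That is one plausible reading of why a rank-based search could run faster than an exhaustive scan.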

From the perspective of predictive modeling, I adopt a Gamma-Poisson (GP) model to compute the probability of an outbreak in each cluster when analyzing univariate data. For multivariate data, I build a multinomial generalized Dirichlet (MGD) model to identify outbreak clusters; these data include the over-the-counter (OTC) data streams collected by the National Retail Data Monitor (NRDM) and the emergency department (ED) data streams collected by the RODS system.
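
As a rough illustration of how a Gamma-Poisson model can yield an outbreak probability for one cluster, the sketch below compares a null and an outbreak hypothesis. It assumes the count is Poisson with mean q times the baseline, and that the relative risk q has a Gamma prior under each hypothesis. The prior probability and the Gamma hyperparameters are hypothetical values chosen for the example, not values taken from the dissertation.

```python
import math


def log_marginal(count, baseline, shape, rate):
    """Log marginal likelihood of `count` when count ~ Poisson(q * baseline)
    and q ~ Gamma(shape, rate). Integrating out q gives a negative binomial.
    Requires baseline > 0 and an integer count >= 0."""
    return (
        math.lgamma(shape + count)
        - math.lgamma(shape)
        - math.lgamma(count + 1)
        + shape * math.log(rate / (rate + baseline))
        + count * math.log(baseline / (rate + baseline))
    )


def outbreak_posterior(count, baseline, prior=0.05,
                       null=(10.0, 10.0), alt=(20.0, 10.0)):
    """Posterior probability of the outbreak hypothesis via Bayes' rule.

    null / alt are (shape, rate) pairs; the defaults (assumptions) give a
    prior mean relative risk of 1 under no outbreak and 2 under an outbreak.
    """
    log_h0 = math.log(1.0 - prior) + log_marginal(count, baseline, *null)
    log_h1 = math.log(prior) + log_marginal(count, baseline, *alt)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_h0, log_h1)
    w0, w1 = math.exp(log_h0 - m), math.exp(log_h1 - m)
    return w1 / (w0 + w1)


# Hypothetical cluster with a baseline of 10 expected cases.
p_elevated = outbreak_posterior(30, 10.0)  # count well above baseline
p_normal = outbreak_posterior(10, 10.0)    # count equal to baseline
```

With these assumed hyperparameters, a count far above baseline pushes the posterior toward the outbreak hypothesis, while a count at baseline leaves it below the prior.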

Key contributions of this dissertation are: 1) a rank-based tempo-spatial clustering algorithm, RSC, which combines greedy search with a Bayesian GP model for disease outbreak detection and achieves comparable detection timeliness and cluster positive predictive value (PPV) with improved running time; and 2) a multivariate extension of RSC (MRSC), which applies the MGD model. The evaluation demonstrated that the MGD model can effectively suppress false alarms caused by elevated signals that are not disease-related and occur in all the monitored data streams.


Details

Item Type: University of Pittsburgh ETD
Status: Published
Creators/Authors:
Creator | Email | Pitt Username | ORCID
Que, Jialan | jiq4@pitt.edu | JIQ4 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Tsui, Fu-Chiang | tsui2@pitt.edu | TSUI2 |
Committee Member | Cooper, Gregory | gfc@pitt.edu | GFC |
Committee Member | Day, Roger | day01@pitt.edu | DAY01 |
Committee Member | Hauskrecht, Milos | milos@pitt.edu | MILOS |
Date: 2 October 2012
Date Type: Publication
Defense Date: 13 April 2012
Approval Date: 2 October 2012
Submission Date: 16 April 2012
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 161
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Intelligent Systems
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: spatial scan, disease surveillance, algorithm, disease outbreak detection
Date Deposited: 02 Oct 2013 05:00
Last Modified: 15 Nov 2016 13:57
URI: http://d-scholarship.pitt.edu/id/eprint/11842
