Statistical analysis of infectious disease data on networks

Li, Xuan (2015) Statistical analysis of infectious disease data on networks. Master's Thesis, University of Pittsburgh. (Unpublished)

Abstract

Purpose
Infectious disease modeling has long helped researchers understand the complex spread patterns of infectious diseases. Social contact networks and agent-based models can be used to represent social contact patterns and the spread of infection. The goal of this research is to investigate the relationship between network measurements and individual infection risk using statistical analysis.
Public Health Significance
This research contributes to a better understanding of the factors that drive infection risk in a population. Identifying central individuals may inform the design of efficient surveillance and prevention programs.
Methods
Three social contact network models were used in this thesis: an Erdos-Renyi network, a Barabasi-Albert network, and the Jefferson County contact network from the FRED platform. We simulated mild and severe epidemic outbreaks on each network and calculated the infection risk and infection speed of each individual. Six network measurements (degree, betweenness centrality, closeness centrality, eigenvector centrality, PageRank, and clustering coefficient) were evaluated on their ability to identify groups with different levels of infection risk and infection speed. Random Forest and variable importance were used to estimate the most important factors in predicting infection risk.
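As a rough illustration of the kind of pipeline described above, the following is a minimal sketch and not the thesis code. It uses only the Python standard library, builds an Erdos-Renyi graph, computes two of the listed measurements (degree and clustering coefficient), and estimates each node's infection risk as the fraction of simulated outbreaks in which it is infected. The epidemic model (a discrete-time SIR process with a one-step infectious period), the parameter values, and all function names are assumptions made for the example. FRED, the other centrality measures, infection speed, and the Random Forest step are not reproduced here.

```python
import random


def erdos_renyi(n, p, rng):
    """Return an undirected G(n, p) graph as {node: set of neighbors}."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def degree(adj):
    """Number of contacts of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def clustering(adj):
    """Local clustering coefficient: share of neighbor pairs that are linked."""
    out = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            out[v] = 0.0
            continue
        # Each link between two neighbors is counted twice, hence the // 2.
        links = sum(len(adj[u] & nbrs) for u in nbrs) // 2
        out[v] = 2.0 * links / (k * (k - 1))
    return out


def simulate_sir(adj, beta, rng):
    """Run one outbreak from a random seed node (assumed model).

    Each infectious node transmits to each susceptible neighbor with
    probability beta, then recovers after one step.
    Returns the set of nodes that were ever infected.
    """
    seed = rng.choice(list(adj))
    infectious = {seed}
    ever_infected = {seed}
    while infectious:
        newly_infected = set()
        for u in infectious:
            for v in adj[u]:
                if v not in ever_infected and rng.random() < beta:
                    newly_infected.add(v)
        ever_infected |= newly_infected
        infectious = newly_infected
    return ever_infected


def infection_risk(adj, beta, runs, rng):
    """Fraction of simulated outbreaks in which each node is infected."""
    counts = {v: 0 for v in adj}
    for _ in range(runs):
        for v in simulate_sir(adj, beta, rng):
            counts[v] += 1
    return {v: c / runs for v, c in counts.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
graph = erdos_renyi(200, 0.05, rng)
deg = degree(graph)
clus = clustering(graph)
risk = infection_risk(graph, beta=0.15, runs=200, rng=rng)

# Compare mean estimated risk in the lowest- and highest-degree quartiles.
ranked = sorted(graph, key=lambda v: deg[v])
low, high = ranked[:50], ranked[-50:]
mean_risk = lambda nodes: sum(risk[v] for v in nodes) / len(nodes)
print("mean risk, low-degree quartile: ", round(mean_risk(low), 3))
print("mean risk, high-degree quartile:", round(mean_risk(high), 3))
```

A higher `beta` could stand in for a more severe outbreak. A full analysis would then pass the per-node measurements and risk labels to a classifier such as a random forest.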
Results
For the Barabasi-Albert and Erdos-Renyi networks, centrality measurements are critical factors in identifying infection risk. Degree is the most important factor in the Barabasi-Albert network, while in the Erdos-Renyi network closeness is the most important in the mild outbreak and degree in the severe outbreak. Results for the Jefferson County contact network in FRED show the importance of location sizes. The highly clustered structure of the location-based model makes betweenness centrality and clustering coefficient important in predicting infection risk.
Conclusion
Both the network structure and the characteristics of the disease influence the importance of network measurements. Network structure also influences the correlations between network measurements. Random forest is a powerful tool for classifying infection risk. Centrality measurements may help identify people at high risk of infection.

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Li, Xuan	xul23@pitt.edu	XUL23	0000-0001-7300-9960

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Marsh, Gary	gmarsh@pitt.edu	GMARSH
Committee Member	Grefenstette, John J.	gref@pitt.edu	GREF
Committee Member	Guclu, Hasan	guclu@pitt.edu	GUCLU
Committee Member	Kumar, Supriya	supriya@pitt.edu	SUPRIYA

Date:

28 September 2015

Date Type:

Publication

Defense Date:

29 June 2015

Approval Date:

28 September 2015

Submission Date:

24 July 2015

Access Restriction:

3 year -- Restrict access to University of Pittsburgh for a period of 3 years.

Number of Pages:

Institution:

University of Pittsburgh

Schools and Programs:

School of Public Health > Biostatistics

Degree:

MS - Master of Science

Thesis Type:

Master's Thesis

Refereed:

Yes

Uncontrolled Keywords:

Social Contact Network; Random Forest; Infectious Disease; Agent-Based Model; FRED

Date Deposited:

28 Sep 2015 18:32

Last Modified:

01 Sep 2018 05:15

URI:

http://d-scholarship.pitt.edu/id/eprint/25754
