Exploring the Additive Effects of Religious Participation on Multivariate, Demographics Based Machine Learning Models

Jacobs, Ian Michael (2024) Exploring the Additive Effects of Religious Participation on Multivariate, Demographics Based Machine Learning Models. Master's Thesis, University of Pittsburgh. (Unpublished)


Throughout the 21st century, vaccine hesitancy has significantly affected vaccine development and rollout in the United States. A known and well-documented contributor to this kind of structural hesitancy is regular participation in a religious congregation or community whose doctrine or teachings condemn vaccination or modern medicine in some form. The public health contribution of this thesis is to support the use of machine learning in predicting public health outcomes and to promote the inclusion of socially anchored metrics in demographics-based models.
Data for this project were sourced from the Department of Health and Human Services Office of the Assistant Secretary for Planning and Evaluation, the U.S. Department of Agriculture’s Economic Research Service, and the Association of Statisticians of American Religious Bodies’ U.S. Religion Census. These data were cleaned at the U.S. county level, and the remaining variables were grouped into six major demographic categories: education, population, poverty, unemployment, vaccine hesitancy, and religious participation. This cleaning process resulted in 54 usable demographic variables and one outcome variable.
After data cleaning, four machine learning techniques were applied to the variable set to compare their predictive ability: elastic net, multivariate adaptive regression splines, random forest, and gradient boosted trees. Based on the root mean square error and R-squared of each model, the gradient boosted trees method had the greatest predictive ability on this dataset.
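The comparison criterion described above can be illustrated with a minimal sketch. It is not the thesis code: the function names, the model labels `model_a` and `model_b`, and the toy numbers are all assumptions made up for illustration. It only shows how root mean square error and R-squared are computed from held-out predictions and used to rank candidate models.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rank_models(y_true, predictions):
    """Return (name, rmse, r2) tuples sorted from lowest to highest RMSE."""
    scored = [(name, rmse(y_true, p), r_squared(y_true, p))
              for name, p in predictions.items()]
    return sorted(scored, key=lambda row: row[1])

# Toy held-out outcome values (percent) and predictions; NOT data from the thesis.
y_test = [10.0, 20.0, 30.0, 40.0]
predictions = {
    "model_a": [10.0, 20.0, 30.0, 40.0],  # hypothetical model that predicts perfectly
    "model_b": [25.0, 25.0, 25.0, 25.0],  # hypothetical model that always predicts the mean
}

ranking = rank_models(y_test, predictions)
for name, err, r2 in ranking:
    print(f"{name}: RMSE={err:.3f}, R^2={r2:.3f}")
```

Lower RMSE and higher R-squared both indicate better predictions, so a model that always predicts the mean scores an R-squared of zero and ranks below any model that explains some of the variance.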
Variable selection through importance testing determined that 26 of the 54 variables contributed significantly to the model and provided the most substantial predictive ability. Of those 26 variables, two originated from the religion category. Results from the gradient boosted tree analysis indicated a decrease in predictive ability when the selected religion variables were removed from the model, which supports a data-based link between vaccine hesitancy and religious participation. Post-hoc hierarchical clustering was performed at the county level to give a visual representation of the demographically constructed clusters and to provide a geographically based comparison between the selected demographics and vaccine hesitancy.
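The removal comparison described above (fit with and without a group of variables, then compare held-out error) can be sketched as follows. This is an illustration under loud assumptions: it substitutes ordinary least squares for the gradient boosted trees used in the thesis so that it runs with the standard library alone, and the data are synthetic, with column 1 standing in for a hypothetical "religion" variable.

```python
import math

def fit_ols(X, y):
    """Least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, X):
    """Apply fitted coefficients (intercept first) to each row of X."""
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def drop_columns(X, cols):
    """Return a copy of X without the column indices listed in cols."""
    return [[v for j, v in enumerate(r) if j not in cols] for r in X]

def ablation_rmse(X_train, y_train, X_test, y_test, cols):
    """Held-out RMSE of the full model and of the model refit without cols."""
    full = rmse(y_test, predict(fit_ols(X_train, y_train), X_test))
    beta = fit_ols(drop_columns(X_train, cols), y_train)
    reduced = rmse(y_test, predict(beta, drop_columns(X_test, cols)))
    return full, reduced

# Synthetic data, NOT from the thesis: y = 1 + 2*x0 + 3*x1 exactly.
X_train = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
y_train = [1, 3, 4, 6, 8, 9]
X_test = [[2, 2], [3, 0], [0, 3]]
y_test = [11, 7, 10]

# Column 1 plays the role of the removed (hypothetical) religion variable.
full_rmse, reduced_rmse = ablation_rmse(X_train, y_train, X_test, y_test, cols={1})
print(f"full RMSE={full_rmse:.4f}, reduced RMSE={reduced_rmse:.4f}")
```

A rise in held-out error after removal indicates that the dropped variables carried predictive information the remaining variables could not replace, which is the logic behind the comparison reported in the abstract.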



Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Jacobs, Ian Michael (Email: imj14@pitt.edu; Pitt Username: imj14; ORCID: 0009-0006-0573-9455)
ETD Committee:
Committee Chair: Youk, Ada (Email: ayouk@pitt.edu; Pitt Username: ayouk)
Committee Member: Grant, Haley (Email: HALEYG@pitt.edu; Pitt Username: HALEYG)
Committee Member: Shah, Nilesh (Email: nhs3@pitt.edu; Pitt Username: nhs3)
Committee Member: Tang, Lu (Email: LUTANG@pitt.edu; Pitt Username: LUTANG)
Date: 14 May 2024
Date Type: Publication
Defense Date: 3 April 2024
Approval Date: 14 May 2024
Submission Date: 23 April 2024
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 75
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: N/A
Date Deposited: 14 May 2024 19:04
Last Modified: 14 May 2024 19:04