
Prediction of Apgar Score Using Statistical Learning

Oryshkewych, Nina (2022) Prediction of Apgar Score Using Statistical Learning. Master's Thesis, University of Pittsburgh. (Unpublished)



Background: Apgar score is a measure of neonatal health. A low Apgar score has been linked to several adverse health outcomes. Ambient air pollution has been shown to be a major threat to public health, but there is limited research on the relationship between maternal exposure to air pollution and Apgar score.
Methods: Maternal exposure to air pollution was calculated for each trimester and for each of the seven criteria air pollutants based on the nearest monitor to each mother’s residence. A combination of random over- and under-sampling was performed on the training data to balance the class distribution of Apgar score. Extreme gradient boosting (XGBoost) and logistic regression were used to build eight classification models – two using all predictors and six trimester-specific models.
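The resampling step described in the Methods can be illustrated with a minimal sketch. This is not the thesis code: the function name `rebalance`, the `target_per_class` parameter, and the toy data are assumptions made for illustration, and the class sizes actually used in the thesis are not stated in this abstract.

```python
import random
from collections import Counter

def rebalance(rows, labels, target_per_class, seed=0):
    """Randomly over- or under-sample each class to target_per_class rows."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Under-sampling: draw without replacement from the larger class.
            sample = rng.sample(members, target_per_class)
        else:
            # Over-sampling: keep every original row, then add random duplicates.
            extra = rng.choices(members, k=target_per_class - len(members))
            sample = members + extra
        out_rows.extend(sample)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels

# Hypothetical toy data: 1 = low Apgar score (rare), 0 = normal.
labels = [0] * 90 + [1] * 10
rows = [{"id": i} for i in range(100)]
bal_rows, bal_labels = rebalance(rows, labels, target_per_class=40)
print(Counter(bal_labels))
```

As in the Methods, resampling of this kind would be applied to the training data only, so that the held-out data keep the original class distribution.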
Results: All models had poor discriminative ability. The best performing model was the XGBoost second trimester model, with an AUC of 0.627. In the XGBoost models, gestational age appeared to be the most important predictor of Apgar score, followed by the air pollution exposure variables. In the logistic regression models, gestational age was the most significant predictor.
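For readers unfamiliar with the metric, the AUC reported above can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, so 0.5 corresponds to chance. The sketch below computes it directly from that definition; the function name `auc` and the example labels and scores are hypothetical and are not data from the thesis.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical scores for four cases (1 = low Apgar score).
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; library implementations use a rank-based computation instead.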
Conclusion: Gestational age is the primary driver of Apgar score, and exposure to air pollution may be important as well. While none of the models had adequate predictive ability, there are a few limitations to this study that may have hindered their performance. Future research should consider more sophisticated resampling techniques as well as geospatial modelling of pollution concentrations in order to improve the quality of the data.
Public Health Significance: Many studies have investigated the consequences of a low Apgar score, but few have explored the factors that influence it. This study suggests that exposure to ambient air pollution could be linked to a low five-minute Apgar score. A classification model for Apgar score could guide practitioners and public health officials in implementing preventive measures to protect neonatal health.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Oryshkewych, Nina (Email: nso6@pitt.edu; Pitt Username: nso6)
ETD Committee:
Committee Member: Buchanich, Jeanine (Email: jeanine@pitt.edu; Pitt Username: jeanine)
Committee Member: Youk, Ada (Email: ayouk@pitt.edu; Pitt Username: ayouk)
Committee Member: Carlson, Jenna (Email: jnc35@pitt.edu; Pitt Username: jnc35)
Committee Member: Talbott, Evelyn (Email: eot1@pitt.edu; Pitt Username: eot1)
Thesis Advisor: Buchanich, Jeanine (Email: jeanine@pitt.edu; Pitt Username: jeanine)
Date: 12 May 2022
Date Type: Publication
Defense Date: 25 April 2022
Approval Date: 12 May 2022
Submission Date: 28 April 2022
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 86
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Apgar score, air pollution, statistical learning, classification
Date Deposited: 12 May 2022 13:46
Last Modified: 12 May 2022 13:46

