
Deep Learning for Investigating Causal Effects with High-Dimensional Data: Analytic Tools and Applications to Educational Interventions

Guzman-Alvarez, Alberto (2023) Deep Learning for Investigating Causal Effects with High-Dimensional Data: Analytic Tools and Applications to Educational Interventions. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



Recent developments in machine learning have the potential to revolutionize quantitative education
research. However, realizing this potential requires bridging the worlds of educational research and computer
science. In my dissertation, I merge advances in deep learning and causal inference to enable researchers
to assess program impacts using quasi-experimental methods with high-dimensional data.

My first dissertation paper proposes a new analytical procedure that uses deep neural networks
to estimate propensity scores for propensity score weighting. The networks flexibly accommodate
high-dimensional data and complex relationships between treatment selection and observable characteristics.
In my analysis, I find that while logistic regression yields low bias and small standard errors in the
estimated average treatment effect in a low-dimensional data setting, machine learning approaches,
particularly my deep neural network approach and bagged CART, perform better in high-dimensional settings.
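
The general idea of weighting by neural-network propensity scores can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure developed in the dissertation: the simulated data, the one-hidden-layer network, the learning rate, the clipping threshold, and all function names (`simulate`, `fit_propensity_mlp`, `ipw_ate`) are hypothetical choices made for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def simulate(n=4000, p=5, true_ate=2.0, seed=0):
    """Simulated observational data (hypothetical, not the dissertation's data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    # Treatment selection depends nonlinearly on observed covariates.
    logit = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1]
    T = rng.binomial(1, sigmoid(logit)).astype(float)
    # The outcome depends on the same covariates, so a raw comparison is confounded.
    Y = true_ate * T + 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)
    return X, T, Y


def fit_propensity_mlp(X, T, hidden=16, lr=0.3, epochs=800, seed=0):
    """One-hidden-layer network for P(T=1 | X), trained by full-batch
    gradient descent on the mean log loss. Returns a prediction function."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                   # hidden activations
        e = sigmoid(H @ w2 + b2)                   # predicted propensity scores
        d_out = (e - T) / n                        # gradient w.r.t. the output logit
        d_hid = np.outer(d_out, w2) * (1 - H**2)   # backpropagate through tanh
        w2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)

    def predict(X_new):
        return sigmoid(np.tanh(X_new @ W1 + b1) @ w2 + b2)

    return predict


def ipw_ate(Y, T, e, clip=0.01):
    """Normalized inverse-probability-weighted estimate of the average
    treatment effect. Scores are clipped to limit extreme weights."""
    e = np.clip(e, clip, 1 - clip)
    w1 = T / e
    w0 = (1 - T) / (1 - e)
    return np.sum(w1 * Y) / np.sum(w1) - np.sum(w0 * Y) / np.sum(w0)


X, T, Y = simulate()
e_hat = fit_propensity_mlp(X, T)(X)
naive = Y[T == 1].mean() - Y[T == 0].mean()
print("naive difference in means:", naive)
print("weighted ATE estimate:   ", ipw_ate(Y, T, e_hat))
```

In this simulation the true effect is set to 2.0, so the two printed values show how far the raw comparison and the weighted estimate each fall from it. Swapping `fit_propensity_mlp` for a logistic regression or a bagged tree model would give the kind of comparison the paragraph above describes.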

In addition to the methodological contributions, my dissertation makes substantive contributions
to the applied literature. In my second dissertation study, I evaluate a large-scale A.I. chatbot college
access intervention that offered critical supports to historically and economically marginalized high school
students to ease their transition into college during the COVID-19 pandemic. The study sheds light on the
intervention’s effectiveness and its potential for improving educational outcomes during the pandemic.

Overall, this dissertation demonstrates the potential of machine learning and causal inference
methods to advance quantitative education research. It provides a new approach for estimating
propensity scores that can be used in high-dimensional settings, thereby improving the accuracy and
reliability of impact assessments. The findings from the evaluation of the college access intervention offer
important insights into how such programs can support students during challenging times and improve their
educational outcomes, particularly for those who face systemic barriers.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creator: Guzman-Alvarez, Alberto
Email: alg223@pitt.edu
Pitt Username: alg223
ORCID: 0000-0002-1027-6612
ETD Committee:
Committee CoChair: Page,
Committee CoChair: Qin,
Committee Member: Scott,
Committee Member: Correnti,
Date: 22 May 2023
Date Type: Publication
Defense Date: 28 March 2023
Approval Date: 22 May 2023
Submission Date: 21 April 2023
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 158
Institution: University of Pittsburgh
Schools and Programs: School of Education > Learning Sciences and Policy
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: propensity score analysis, causal inference, education, machine learning, nudges, college access
Date Deposited: 22 May 2023 12:51
Last Modified: 22 May 2023 12:51

