
Imputation-Based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data

Lyu, Lingyun (2023) Imputation-Based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

This is the latest version of this item.

Restricted to University of Pittsburgh users only until 6 January 2025.



An optimal dynamic treatment regime (DTR) is a sequence of treatment decisions that yields the best expected outcome. Little work has been reported on estimating optimal DTRs for right-censored survival outcomes, and most existing work relies on parametric models. In this dissertation, we propose a new statistical method for estimating optimal DTRs with right-censored survival outcomes and extend it to the competing risks framework.

In the first project, we propose an imputation-based Q-learning (IQ-learning) method for optimizing DTRs in multi-stage decision making. A semiparametric Cox proportional hazards model is used to estimate the optimal treatment rule at each stage, and then weighted hot-deck multiple imputation (MI) and direct-draw MI are used to predict optimal potential survival times. We extend the proposed optimal DTR estimation methods to an incomplete-data setting, in which missing data are handled using inverse probability weighting and MI. We investigate the performance of IQ-learning via extensive simulations and show that it is robust to model misspecification, that, in contrast to parametric models, it imputes only plausible potential survival times, and that it allows more flexibility in the shape of the baseline hazard. We demonstrate IQ-learning by developing an optimal DTR for leukemia treatment based on a randomized trial with observational follow-up.
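The weighted hot-deck idea mentioned above can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function name `weighted_hot_deck_draws`, the donor-pool rule (observed event times beyond the recipient's censoring time), the Gaussian kernel weights on a one-dimensional risk score, and the `bandwidth` parameter are all assumptions made for illustration. It only shows why a hot-deck draw always returns a survival time that was actually observed.

```python
import numpy as np


def weighted_hot_deck_draws(censor_time, recipient_score, donor_times, donor_scores,
                            n_imputations=5, bandwidth=1.0, rng=None):
    """Draw imputed event times for one censored subject from observed donors.

    Hypothetical illustration: donors are subjects with an observed event,
    and scores are any one-dimensional risk summary (for example, a fitted
    linear predictor). Neither choice is taken from the dissertation.
    """
    rng = np.random.default_rng() if rng is None else rng
    donor_times = np.asarray(donor_times, dtype=float)
    donor_scores = np.asarray(donor_scores, dtype=float)

    # Assumed donor-pool rule: only event times beyond the censoring time
    # are eligible, so every imputed value is consistent with the censoring.
    eligible = donor_times > censor_time
    if not eligible.any():
        raise ValueError("no donor has an event time beyond the censoring time")

    # Assumed weighting: Gaussian kernel on the distance between risk scores,
    # so donors that resemble the recipient are drawn more often.
    dist = (donor_scores[eligible] - recipient_score) / bandwidth
    weights = np.exp(-0.5 * dist ** 2)
    total = weights.sum()
    if total == 0.0:
        # All weights underflowed; fall back to equal weights.
        weights = np.full(weights.shape, 1.0 / weights.size)
    else:
        weights = weights / total

    # Sample with replacement: one draw per imputed data set.
    return rng.choice(donor_times[eligible], size=n_imputations,
                      replace=True, p=weights)


# Example: a subject censored at t = 2.0 with risk score 0.1.
draws = weighted_hot_deck_draws(
    censor_time=2.0,
    recipient_score=0.1,
    donor_times=[1.0, 2.5, 3.0, 4.2],
    donor_scores=[0.0, 0.2, 0.1, 0.9],
    rng=np.random.default_rng(0),
)
```

Every value in `draws` is one of the donors' observed event times and exceeds the censoring time, which is the sense in which a hot-deck draw stays within the range of plausible survival times.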

In the second project, we extend the proposed IQ-learning method to identify DTRs that optimize right-censored competing risks outcomes. As in the first project, the optimization step uses Cox models for the cause-specific hazard functions to estimate the optimal treatment rule at each stage. In the post-optimization prediction step, we propose three hot-deck MI-based methods to predict the counterfactual competing risks times for those who did not receive their optimal treatments. The performance of the proposed method is evaluated through simulation studies.

Public health significance: The statistical methods proposed in this dissertation contribute to the field of dynamic treatment regime optimization, which has a substantial impact on precision medicine, especially for the treatment of chronic diseases such as cancer, AIDS, and depression. The estimated individualized treatment rules can guide and shape health policies, which will ultimately improve overall public health and safety.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators | Email | Pitt Username | ORCID
Lyu, Lingyun | lil114@pitt.edu | lil114 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Wahed, S. Wahed | wahed@pitt.edu | wahed |
Committee Member | Cheng, Yu | yucheng@pitt.edu | yucheng |
Committee Member | Chang, Chung-Chou | changj@pitt.edu | changj |
Committee Member | Tang, Gong | got1@pitt.edu | got1 |
Date: 6 January 2023
Date Type: Publication
Defense Date: 18 November 2022
Approval Date: 6 January 2023
Submission Date: 6 January 2023
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 82
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Cox proportional hazard model; Hot-deck multiple imputation; Optimal dynamic treatment regime; Competing risks; Precision medicine; Propensity score.
Date Deposited: 06 Jan 2023 20:51
Last Modified: 06 Jan 2023 20:51

Available Versions of this Item

  • Imputation-Based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data. (deposited 06 Jan 2023 20:51) [Currently Displayed]

