Imputation-Based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data

Lyu, Lingyun (2023) Imputation-Based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

This is the latest version of this item.

Abstract

An optimal dynamic treatment regime (DTR) is a sequence of treatment decisions that yields the best expected outcome. Limited work has been reported on estimating optimal DTRs for right-censored survival outcomes, and most existing work relies on parametric models. In this dissertation, we propose a new statistical method to estimate optimal DTRs with right-censored survival outcomes and extend this method to the competing risks framework.

In the first project, we propose an imputation-based Q-learning (IQ-learning) method for optimizing DTRs in multi-stage decision making, where a semiparametric Cox proportional hazards model is employed to estimate optimal treatment rules for each stage and then weighted hot-deck multiple imputation (MI) and direct-draw MI are used to predict optimal potential survival times. We extend the proposed optimal DTR estimation methods to an incomplete-data setting, where missing data are handled using inverse probability weighting and MI. We investigate the performance of IQ-learning via extensive simulations and show that it is robust to model mis-specification, that, unlike parametric models, it imputes only plausible potential survival times, and that it provides more flexibility in terms of baseline hazard shape. We demonstrate IQ-learning by developing an optimal DTR for leukemia treatment based on a randomized trial with observational follow-up.
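
The two steps described above (a Cox-based treatment rule, then hot-deck imputation of the potential survival time) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function names, the interaction-coefficient vector `psi`, the Gaussian kernel, and the `bandwidth` parameter are all hypothetical choices made for illustration. It assumes a single stage, a binary treatment, and a hazard of the form h0(t) * exp(main(x) + a * (psi . x)), so that the treatment with the lower linear predictor has the lower hazard.

```python
import math
import random


def optimal_treatment(x, psi):
    """Return the treatment a in {0, 1} that minimizes the Cox linear predictor.

    Assumption: hazard = h0(t) * exp(main(x) + a * (psi . x)). The baseline
    hazard and main effects are shared by both arms, so only the sign of the
    treatment-by-covariate interaction term matters.
    """
    interaction = sum(p * xi for p, xi in zip(psi, x))
    return 1 if interaction < 0 else 0


def weighted_hot_deck_draw(target_score, donors, bandwidth=0.5, rng=random):
    """Draw one observed event time from a donor pool.

    donors: list of (risk_score, observed_event_time) pairs taken from
    subjects who actually received the recipient's optimal treatment.
    Donors with risk scores close to target_score get larger weights
    (Gaussian kernel; a hypothetical weighting, not the author's).
    """
    weights = [
        math.exp(-0.5 * ((score - target_score) / bandwidth) ** 2)
        for score, _ in donors
    ]
    times = [time for _, time in donors]
    # random.choices samples proportionally to the supplied weights.
    return rng.choices(times, weights=weights, k=1)[0]


def multiply_impute(target_score, donors, m=5, rng=random):
    """Return m imputed survival times, one per imputed data set."""
    return [weighted_hot_deck_draw(target_score, donors, rng=rng) for _ in range(m)]
```

Because every imputed value is an observed donor time, the draws can never fall outside the range of survival times seen in the data, which is one way to read the abstract's claim that only plausible times are imputed. A call such as `multiply_impute(0.15, [(0.1, 5.0), (0.2, 7.5), (3.0, 1.0)], rng=random.Random(0))` returns five times, each taken from the donor list.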

In the second project, we extend the proposed IQ-learning method to identify DTRs that optimize right-censored competing risks outcomes. Similar to IQ-learning, in the optimization step, we use the Cox model on cause-specific hazard functions to estimate the optimal treatment rule for each stage. Then, in the post-optimization prediction step, we propose three hot-deck MI-based methods to predict the counterfactual competing risk times for those who did not receive their optimal treatments. The performance of the proposed method is evaluated through simulation studies.
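
For readers unfamiliar with the competing risks setting, the standard definitions behind a Cox model on cause-specific hazards can be written as follows. The notation is an assumption made for illustration (event time T, cause indicator \varepsilon, covariates x, binary treatment a, coefficients \beta_k and \psi_k); the dissertation may parameterize the model differently.

```latex
% Cause-specific hazard for cause k
\lambda_k(t \mid x, a)
  = \lim_{\Delta \to 0^{+}}
    \frac{P\left(t \le T < t + \Delta,\ \varepsilon = k \mid T \ge t,\ x,\ a\right)}{\Delta}

% Cox form with a treatment-by-covariate interaction (assumed parameterization)
\lambda_k(t \mid x, a)
  = \lambda_{0k}(t)\, \exp\!\left(\beta_k^{\top} x + a\, \psi_k^{\top} x\right)

% Cumulative incidence of cause k, with overall survival S
F_k(t \mid x, a) = \int_0^t \lambda_k(u \mid x, a)\, S(u \mid x, a)\, du,
\qquad
S(u \mid x, a) = \exp\!\left(-\sum_{j} \int_0^u \lambda_j(s \mid x, a)\, ds\right)
```

The cumulative incidence of one cause depends on the hazards of all causes through S, which is why competing event times and causes have to be handled jointly in the prediction step.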

Public health significance: The statistical methods proposed in this dissertation contribute to the field of dynamic treatment regime optimization, which has a substantial impact on precision medicine, especially for the treatment of chronic diseases such as cancer, AIDS, and depression. The estimated individualized treatment rules can guide and shape health policies, which will ultimately improve overall public health and safety.

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Lyu, Lingyun	lil114@pitt.edu	lil114

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Wahed, Abdus S.	wahed@pitt.edu	wahed
Committee Member	Cheng, Yu	yucheng@pitt.edu	yucheng
Committee Member	Chang, Chung-Chou	changj@pitt.edu	changj
Committee Member	Tang, Gong	got1@pitt.edu	got1

Date:

6 January 2023

Date Type:

Publication

Defense Date:

18 November 2022

Approval Date:

6 January 2023

Submission Date:

6 January 2023

Access Restriction:

2 year -- Restrict access to University of Pittsburgh for a period of 2 years.

Number of Pages:

Institution:

University of Pittsburgh

Schools and Programs:

School of Public Health > Biostatistics

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

Cox proportional hazard model; Hot-deck multiple imputation; Optimal dynamic treatment regime; Competing risks; Precision medicine; Propensity score.

Date Deposited:

06 Jan 2023 20:51

Last Modified:

14 Feb 2025 13:54

URI:

http://d-scholarship.pitt.edu/id/eprint/44088

Available Versions of this Item

Imputation-Based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data. (deposited 06 Jan 2023 20:51) [Currently Displayed]
