
Latent Variable Models for Analyses of Diagnostic Tests and Regression Analyses with Hierarchical Missing Covariates

Wang, Xianling (2021) Latent Variable Models for Analyses of Diagnostic Tests and Regression Analyses with Hierarchical Missing Covariates. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

This is the latest version of this item.



This dissertation concerns statistical analyses with latent variables under two scenarios. Many discrete diagnostic markers, such as breast cancer tumor grade, are important prognostic factors yet suffer from poor reproducibility because of their subjective nature. When multiple independent ratings are available, latent class models are the models of choice for statistical inference. However, model parameters are only estimable up to a permutation of the labels of the underlying truth. When an auxiliary variable associated with the underlying truth in a known trend is observed, we proposed a joint model that achieves global identification and yields more efficient estimates. A remedy for a specific violation of the conditional independence assumption in those classical models was also provided. The methods were illustrated in the analysis of a tumor grade reading dataset from the National Surgical Adjuvant Breast and Bowel Project (NSABP). The improved efficiency was also demonstrated through simulation studies.

The second part of this dissertation concerns regression analyses when a covariate is subject to missing values with a hierarchical missing data mechanism. In electronic health records (EHR) data, some important biomarkers, such as lab test results, are missing for various reasons. Patients in remission are less likely to take those specialized tests. Furthermore, records of tested patients may be missing because of how the EHR data are assembled. In practice, the exact nature of such missingness is unavailable to the investigators. Standard methods such as the maximum likelihood method and inverse probability weighting typically ignore such heterogeneity and may produce biased estimates. We introduced a latent variable model that models the hierarchical missing data process and yields valid parameter estimates. The maximum likelihood method was used for estimation and inference. The proposed method was applied to a motivating EHR dataset from an inflammatory bowel disease registry at the University of Pittsburgh Medical Center. The performance of the proposed method was evaluated by simulation studies.

Public health significance: We proposed novel statistical methods to address missing data under two different scenarios. By yielding valid inference under those circumstances, application of the proposed methods has important public health implications.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Creators | Email | Pitt Username | ORCID
Wang, Xianling | xiw118@pitt.edu | xiw118 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Tang, Gong | got1@pitt.edu | got1 |
Committee Member | Kang, Chaeryon | crkang@pitt.edu | crkang |
Committee Member | McKennan, Christopher Gordon | CHM195@pitt.edu | chm195 |
Committee Member | Tang, Lu | LUTANG@pitt.edu | lutang |
Committee Member | Yabes, Jonathan Guerrero | jgy2@pitt.edu | jgy2 |
Date: 27 August 2021
Date Type: Publication
Defense Date: 14 July 2021
Approval Date: 27 August 2021
Submission Date: 5 August 2021
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 74
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: latent variables, latent class model, diagnostic tests, missing covariate
Date Deposited: 27 Aug 2021 17:59
Last Modified: 27 Aug 2022 05:15

Available Versions of this Item

  • Latent Variable Models for Analyses of Diagnostic Tests and Regression Analyses with Hierarchical Missing Covariates. (deposited 27 Aug 2021 17:59) [Currently Displayed]

