Pleis, John
(2018)
Mixtures of discrete and continuous variables: considerations for dimension reduction.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
For this dissertation, we will examine mixtures of different types of data, the analytic challenges that such data can present, and some approaches for addressing these challenges. Specifically, we will consider mixtures of continuous and discrete data. For the theoretical developments that follow, we will focus on the general location model (GLOM)-based methodology for deriving the joint probability distribution of continuous and discrete random variables as the product of conditional and marginal probability distributions. As we will show, the general specification of this joint distribution is a finite mixture of Gaussian distributions. We will consider both the univariate and multivariate cases. For the univariate case we will first determine the distribution of the sample variance, and for the multivariate case we will first determine the distribution of the sample covariance matrix. When the component distributions of the mixture have different variances (univariate) or covariance matrices (multivariate), analysis can become more challenging. In such cases, we propose approximating the mixture density with a non-mixture density from the same parametric family (e.g., multivariate Gaussian). Finally, we will present some extensions of this work to the field of dimension reduction.
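The factorization described above can be written out as a short worked equation. The notation here is an assumption made for illustration and is not taken from the dissertation: W indexes the K cells formed by cross-classifying the discrete variables, Y is the p-dimensional continuous vector, and \phi_p denotes the p-variate Gaussian density. The last two lines give the standard overall mean and covariance of such a mixture, which is one natural starting point for a single-Gaussian approximation; the dissertation's own approximation may be constructed differently.

```latex
% Illustrative notation (assumed): W = discrete cell index, Y = continuous vector.
\begin{align}
  P(W = k) &= \pi_k, \qquad \sum_{k=1}^{K} \pi_k = 1, \\
  Y \mid W = k &\sim N_p(\mu_k, \Sigma_k), \\
  f(y, k) &= \pi_k \, \phi_p(y; \mu_k, \Sigma_k), \\
  f(y) &= \sum_{k=1}^{K} \pi_k \, \phi_p(y; \mu_k, \Sigma_k), \\
  \mu &= \sum_{k=1}^{K} \pi_k \, \mu_k, \\
  \Sigma &= \sum_{k=1}^{K} \pi_k \left[ \Sigma_k + (\mu_k - \mu)(\mu_k - \mu)^{\top} \right].
\end{align}
```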
Public Health Significance: Mixtures of continuous and discrete variables are fairly common in public health settings (e.g., genetics, health services research), but statistical methods for analyzing such data are not nearly as developed or robust as those for a single type of data (e.g., continuous). The methods developed in this dissertation could be used to expand inferential approaches to non-normal data, which are commonly seen in public health settings. For example, hypothesis testing of the proportionate contribution of eigenvalues could be adapted to mixtures of different types of data, and these methods could possibly be extended to high-dimensional data (e.g., genetics) by examining mixtures of singular Wishart distributions.
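As a rough illustration of the eigenvalue-based quantity mentioned above, the following sketch computes the overall covariance matrix of a two-component bivariate Gaussian mixture with unequal component covariances, then the share of total variance attributable to the largest eigenvalue. The weights, means, and covariances are made-up numbers and the helper names are hypothetical. This is a population-level calculation under those assumptions, not the dissertation's test procedure.

```python
import math


def mixture_moments(weights, means, covs):
    """Overall mean vector and covariance matrix of a finite Gaussian mixture.

    Uses the identity: Cov = sum_k w_k * (Sigma_k + (mu_k - mu)(mu_k - mu)^T).
    """
    p = len(means[0])
    mean = [sum(w * m[i] for w, m in zip(weights, means)) for i in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for w, m, s in zip(weights, means, covs):
        for i in range(p):
            for j in range(p):
                cov[i][j] += w * (s[i][j] + (m[i] - mean[i]) * (m[j] - mean[j]))
    return mean, cov


def eigenvalues_sym2(a):
    """Eigenvalues of a symmetric 2x2 matrix, largest first (closed form)."""
    half_trace = (a[0][0] + a[1][1]) / 2.0
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return [half_trace + disc, half_trace - disc]


# Hypothetical example: one binary variable defines two equally likely cells,
# and the two cells have different covariance matrices.
weights = [0.5, 0.5]
means = [[-1.0, 0.0], [1.0, 0.0]]
covs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 1.0]],
]

mean, cov = mixture_moments(weights, means, covs)
eigs = eigenvalues_sym2(cov)
# Proportionate contribution of the largest eigenvalue to the total variance.
proportion_first = eigs[0] / sum(eigs)
print(mean, cov, eigs, proportion_first)
```

In practice the covariance matrix would be estimated from data, and the sampling distribution of that estimate is what makes inference on such proportions nontrivial for mixed data.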
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Pleis, John
Date: 26 September 2018
Date Type: Publication
Defense Date: 30 May 2018
Approval Date: 26 September 2018
Submission Date: 24 July 2018
Access Restriction: 1 year -- Restrict access to University of Pittsburgh for a period of 1 year.
Number of Pages: 144
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: mixture distributions; Wishart
Date Deposited: 26 Sep 2018 14:07
Last Modified: 01 Sep 2019 05:15
URI: http://d-scholarship.pitt.edu/id/eprint/35090