
Statistical Methods for Cellular Deconvolution with Single-Cell Omics

Cai, Manqi (2024) Statistical Methods for Cellular Deconvolution with Single-Cell Omics. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



Cell-type fractions confound analyses of bulk omics data because tissues are mixtures of many cell types. To deconfound tissue-level analyses, cellular deconvolution is used to estimate the proportions of cell types within bulk data and enable downstream analyses at cell-type resolution. The advent of single-cell technologies has catalyzed improvements in deconvolution references.

In this thesis, we first introduce an ensemble learning algorithm, EnsDeconv, to synthesize cell-type deconvolution estimates from various deconvolution scenarios. It combines results across choices of reference, deconvolution method, and data preprocessing. EnsDeconv incorporates cell-type-specific optimizations to provide accurate and robust deconvolution results, as benchmarked on several large real bulk datasets in transcriptomics and epigenomics.
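As an illustration of the ensemble idea only, the sketch below aggregates proportion estimates for a single sample across several scenarios by taking a per-cell-type median and renormalizing. This aggregation rule, the function name `ensemble_proportions`, and the input layout are assumptions made for this example; the abstract does not specify how EnsDeconv combines its estimates.

```python
from statistics import median


def ensemble_proportions(estimates):
    """Combine per-scenario proportion estimates for one bulk sample.

    `estimates` is a list of dicts, one per deconvolution scenario
    (hypothetical layout), each mapping cell type -> estimated proportion.
    All scenarios are assumed to report the same cell types.
    """
    if not estimates:
        raise ValueError("need at least one scenario")
    cell_types = sorted(estimates[0])
    for est in estimates:
        if sorted(est) != cell_types:
            raise ValueError("scenarios must share the same cell types")

    # Per-cell-type median is robust to a few badly behaved scenarios.
    med = {ct: median(est[ct] for est in estimates) for ct in cell_types}

    # Medians need not sum to one, so rescale onto the simplex.
    total = sum(med.values())
    if total <= 0:
        raise ValueError("all median proportions are zero")
    return {ct: value / total for ct, value in med.items()}


scenarios = [
    {"neuron": 0.5, "glia": 0.5},
    {"neuron": 0.6, "glia": 0.4},
    {"neuron": 0.9, "glia": 0.1},
]
combined = ensemble_proportions(scenarios)
```

Using the median rather than the mean limits the influence of a scenario whose reference or preprocessing is a poor match for the bulk data.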

The reference is the most important factor in deconvolution. In Chapter 2, we improve the reference for DNA methylation using emerging single-cell DNA methylation (scDNAm) data. To address the ultra-high dimensionality and excessive missingness of current scDNAm data, we present a novel workflow, scMD. It enables the construction of refined references from scDNAm data, surpassing conventional sorted-cell or RNA-imputed references in accuracy and precision.

Chapter 3 refines scMD with a latent multinomial cell-type allocation model and a binomial DNA methylation model, applicable to high-throughput sequencing and array-based experiments. This approach is implemented as an Expectation-Maximization (EM) algorithm. By jointly modeling heterogeneous tissue samples and reference scDNAm data, the method simultaneously estimates tissue composition and updated cell-type-specific methylation profiles.
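To make the multinomial allocation and binomial methylation idea concrete, here is a deliberately simplified EM sketch. It assumes each sequencing read at a CpG site comes from cell type k with probability p[k] and is methylated with a known reference probability beta[k][j]. Unlike the method described above, it holds the reference fixed and estimates only the proportions. The function name, argument layout, and toy numbers are assumptions for this example, not the dissertation's implementation.

```python
def em_cell_proportions(meth, total, beta, n_iter=200):
    """Estimate cell-type proportions from bulk methylation read counts.

    meth[j], total[j]: methylated and total read counts at CpG j.
    beta[k][j]: reference methylation probability of cell type k at CpG j,
                assumed strictly between 0 and 1 and held fixed.
    """
    n_types, n_sites = len(beta), len(meth)
    n_reads = sum(total)
    p = [1.0 / n_types] * n_types  # start from uniform proportions

    for _ in range(n_iter):
        expected = [0.0] * n_types
        for j in range(n_sites):
            # Marginal probability that a read at site j is methylated.
            mu = sum(p[k] * beta[k][j] for k in range(n_types))
            unmeth = total[j] - meth[j]
            for k in range(n_types):
                # E-step: expected number of reads at site j from type k,
                # split by methylated and unmethylated reads.
                expected[k] += meth[j] * p[k] * beta[k][j] / mu
                expected[k] += unmeth * p[k] * (1.0 - beta[k][j]) / (1.0 - mu)
        # M-step: proportions are the expected share of reads per type.
        p = [e / n_reads for e in expected]
    return p


# Toy data: two cell types, four CpG sites, 100 reads per site, with counts
# set to their expected values under true proportions (0.3, 0.7).
reference = [[0.9, 0.1, 0.5, 0.2],
             [0.1, 0.8, 0.5, 0.7]]
estimated = em_cell_proportions([34, 59, 50, 55], [100] * 4, reference)
```

A joint version would add an M-step update for `beta` that pools the bulk reads with the reference scDNAm counts; that step is omitted here to keep the sketch short.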




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators | Email | Pitt Username | ORCID
Cai, Manqi | mac538@pitt.edu | mac538 | 0000-0003-3296-1730
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Wang, Jiebiao | JBWANG@pitt.edu | JBWANG |
Committee Member | Chen, Wei | wei.chen@pitt.edu | wei.chen |
Committee Member | McKennan, Chris | CHM195@pitt.edu | CHM195 |
Committee Member | Tseng, George | ctseng@pitt.edu | ctseng |
Date: 14 May 2024
Date Type: Publication
Defense Date: 18 April 2024
Approval Date: 14 May 2024
Submission Date: 25 April 2024
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 102
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: cellular deconvolution, single-cell data, epigenomics, transcriptomics.
Date Deposited: 14 May 2024 18:51
Last Modified: 14 May 2024 18:51

