
Increasing consistency of disease biomarker prediction across datasets

Chikina, MD and Sealfon, SC (2014) Increasing consistency of disease biomarker prediction across datasets. PLoS ONE, 9 (4).


Microarray studies with human subjects often have limited sample sizes, which hampers the ability to detect reliable biomarkers associated with disease and motivates the need to aggregate data across studies. However, human gene expression measurements may be influenced by many non-random factors such as genetics, sample preparation, and tissue heterogeneity. These factors can contribute to a lack of agreement among related studies, limiting the utility of their aggregation. We show that it is feasible to carry out an automatic correction of individual datasets to reduce the effect of such 'latent variables' (without prior knowledge of the variables) in such a way that datasets addressing the same condition show better agreement once each is corrected. We build our approach on the method of surrogate variable analysis (SVA), but we demonstrate that the original algorithm is unsuitable for the analysis of human tissue samples that are mixtures of different cell types. We propose a modification to SVA that is crucial to obtaining the improvement in agreement that we observe. We develop our method on a compendium of multiple sclerosis data and verify it on an independent compendium of Parkinson's disease datasets. In both cases, we show that our method is able to improve agreement across varying study designs, platforms, and tissues. This approach has the potential for wide applicability to any field where lack of inter-study agreement has been a concern. © 2014 Chikina, Sealfon.




Item Type: Article
Status: Published
Creators: Sealfon, SC
Date: 16 April 2014
Date Type: Publication
Journal or Publication Title: PLoS ONE
Volume: 9
Number: 4
DOI or Unique Handle: 10.1371/journal.pone.0091272
Schools and Programs: School of Medicine > Computational and Systems Biology
Refereed: Yes
Date Deposited: 23 Jun 2014 20:54
Last Modified: 29 Jan 2019 15:55

