
A comparative study of different strategies of batch effect removal in microarray data: a case study of three datasets

Ding, Fei (2013) A comparative study of different strategies of batch effect removal in microarray data: a case study of three datasets. Master's Thesis, University of Pittsburgh. (Unpublished)

Abstract

Batch effects are systematic non-biological variability introduced by experimental design and sample processing in microarray experiments. They are common in microarray data and can bias the analysis if ignored. Many batch effect removal methods have been developed. Previous comparative work has focused on how effectively these methods remove batch effects and on their impact on downstream classification analysis. The most common type of analysis for microarray data is differential expression (DE) analysis, which identifies markers that are significantly associated with the outcome of interest, yet no study has examined the impact of these methods on downstream DE analysis. In this project, we investigated the performance of five popular batch effect removal methods (mean-centering, ComBat_p, ComBat_n, SVA, and ratio-based methods) in reducing batch effects, and their impact on DE analysis, using three experimental datasets with different sources of batch effects. We found that the performance of these methods is data-dependent: the simple mean-centering method performed reasonably well in all three datasets, whereas the performance of more complicated algorithms such as ComBat could be unstable for certain datasets, so they should be applied with caution. Given a new dataset, we recommend either using the mean-centering method or, if possible, carefully investigating a few different batch effect removal methods and choosing the one that works best for the data. This study has important public health significance because better handling of batch effects in microarray data can reduce biased results and lead to improved biomarker identification.
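
For readers unfamiliar with mean-centering, the simplest of the methods named in the abstract, the sketch below shows the usual form of the idea: within each batch, each gene's mean expression is subtracted from that gene's values. This is only an illustration and not the thesis's own code. The function name `mean_center_by_batch`, the genes-by-samples array layout, and the toy data are all assumptions made for the example.

```python
import numpy as np

def mean_center_by_batch(expr, batches):
    """Subtract each gene's within-batch mean from its values in that batch.

    expr    : 2-D array, rows = genes (probes), columns = samples (assumed layout)
    batches : 1-D sequence of batch labels, one per sample (column)
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    if expr.ndim != 2 or batches.shape != (expr.shape[1],):
        raise ValueError("expected one batch label per column of expr")
    centered = np.empty_like(expr)
    for label in np.unique(batches):
        cols = batches == label  # boolean mask selecting this batch's samples
        # Per-gene mean over the samples of this batch only
        batch_mean = expr[:, cols].mean(axis=1, keepdims=True)
        centered[:, cols] = expr[:, cols] - batch_mean
    return centered

# Hypothetical toy data: 2 genes x 4 samples; batch "b" is shifted upward.
toy = np.array([[1.0, 3.0, 11.0, 13.0],
                [2.0, 2.0,  7.0,  9.0]])
adjusted = mean_center_by_batch(toy, ["a", "a", "b", "b"])
```

After centering, every gene has mean zero within each batch, so constant per-batch shifts disappear. Any biological difference that is confounded with batch is removed along with them, which is one reason the choice of method can depend on the data.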



Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
Creator | Email | Pitt Username | ORCID
Ding, Fei | fed11@pitt.edu | FED11 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Thesis Advisor | Lin, Yan | yal14@pitt.edu | YAL14 |
Committee Member | Arena, Vincent C. | arena@pitt.edu | ARENA |
Committee Member | Thomas, Sufi | thomsm@upmc.edu | |
Date: 27 September 2013
Date Type: Publication
Defense Date: 7 June 2013
Approval Date: 27 September 2013
Submission Date: 5 June 2013
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 45
Institution: University of Pittsburgh
Schools and Programs: Graduate School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: batch effect; microarray; DE analysis
Date Deposited: 27 Sep 2013 16:17
Last Modified: 15 Nov 2016 14:13
URI: http://d-scholarship.pitt.edu/id/eprint/18962
