
Investigation and implementation Gene signature development using microarray data – A case study on early stage non-small cell lung cancer

Huang, Ruiqi (2015) Investigation and implementation Gene signature development using microarray data – A case study on early stage non-small cell lung cancer. Master's Thesis, University of Pittsburgh. (Unpublished)



Gene signature development using microarrays started more than 15 years ago, yet researchers still make common mistakes. The goal of this research is to investigate and implement gene signature development using Affymetrix array data. It aims to establish a workflow with well-justified steps for gene signature development.

Gene expression data from surgical samples of 62 untreated early-stage NSCLC patients in the JBR.10 trial were used to develop the training model. Individual genes were selected using univariate Cox regression analysis, and the selected gene set was then summarized by principal components, which served as the inputs to the Cox regression model. A multi-layer internal validation was conducted for model evaluation. The performance of the gene signature was evaluated by testing on three independent data sets.
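The selection and modeling steps described above can be illustrated with a minimal NumPy sketch. It is not the thesis code: the data are synthetic, the gene ranking uses the Cox score test at beta = 0 as a stand-in for whatever univariate criterion the thesis applied, event times are assumed to have no ties, and the function names, `n_genes=20`, and `n_components=2` are hypothetical choices made for illustration.

```python
import numpy as np

def cox_score_z(x, time, event):
    """Score-test z statistic (at beta = 0) for a univariate Cox model.
    Assumes no tied event times."""
    order = np.argsort(-time)                 # descending time: cumulative sums run over risk sets
    x, event = x[order], event[order]
    n_risk = np.arange(1, len(x) + 1)         # risk-set size for each subject
    mean = np.cumsum(x) / n_risk              # risk-set mean of x
    var = np.cumsum(x ** 2) / n_risk - mean ** 2
    return np.sum(event * (x - mean)) / np.sqrt(np.sum(event * var))

def fit_cox(X, time, event, n_iter=20):
    """Cox partial-likelihood fit by plain Newton-Raphson (no ties, no step control)."""
    order = np.argsort(-time)
    X, event = X[order], event[order]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = np.exp(X @ beta)
        s0 = np.cumsum(w)
        mean = np.cumsum(w[:, None] * X, axis=0) / s0[:, None]
        s2 = np.cumsum(w[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
        cov = s2 / s0[:, None, None] - mean[:, :, None] * mean[:, None, :]
        grad = (event[:, None] * (X - mean)).sum(axis=0)
        hess = (event[:, None, None] * cov).sum(axis=0)
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def supervised_pca_fit(X, time, event, n_genes=20, n_components=2):
    """Rank genes by |z|, keep the top n_genes, project onto PCs, fit Cox on the PCs."""
    z = np.array([cox_score_z(X[:, j], time, event) for j in range(X.shape[1])])
    selected = np.argsort(-np.abs(z))[:n_genes]
    mu = X[:, selected].mean(axis=0)
    Xc = X[:, selected] - mu
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = vt[:n_components].T
    scores = Xc @ loadings
    scale = scores.std(axis=0)                # standardize PCs to stabilize the Newton steps
    beta = fit_cox(scores / scale, time, event)
    return selected, mu, loadings, scale, beta

def risk_score(X_new, model):
    """Linear predictor for new samples, using training-set centering and loadings."""
    selected, mu, loadings, scale, beta = model
    return ((X_new[:, selected] - mu) @ loadings / scale) @ beta

# Synthetic example (NOT the JBR.10 data): 62 samples, 500 genes, 10 of them prognostic.
rng = np.random.default_rng(0)
n, p = 62, 500
latent = rng.standard_normal(n)
X = rng.standard_normal((n, p))
X[:, :10] += latent[:, None]
true_time = rng.exponential(1.0 / np.exp(0.8 * latent))
censor_time = rng.exponential(2.0, size=n)
event = true_time <= censor_time
time = np.minimum(true_time, censor_time)

model = supervised_pca_fit(X, time, event)
risk = risk_score(X, model)
```

Because gene selection uses the survival outcome, any internal validation of such a sketch would need to repeat the selection inside each resampling fold; scoring the training samples as done here gives an optimistic picture.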

A signature of 88 genes was developed that can identify patients with significantly different survival prognosis (Hazard Ratio, 95% CI, P). The signature was successfully validated in independent datasets (Hazard Ratio, 95% CI, P; Hazard Ratio, 95% CI, P; Hazard Ratio, 95% CI, P).

A workflow for gene signature development has been constructed, composed of preliminary gene filtering, individual gene selection, predictive model construction using supervised principal component analysis, and internal and external validation.
Using Affymetrix array gene expression data from 62 patients in the JBR.10 trial, an 88-gene signature was obtained and validated in independent datasets.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Name | Email | Pitt Username | ORCID
Huang, Ruiqi | ruh9@pitt.edu | RUH9 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Wilson, John | jww@pitt.edu | JWW |
Committee Member | Huang, | | |
Committee Member | Tang, Gong | got1@pitt.edu | GOT1 |
Committee Member | Maria , | | MBROOKS |
Date: 29 June 2015
Date Type: Publication
Defense Date: 23 April 2015
Approval Date: 29 June 2015
Submission Date: 10 April 2015
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 50
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: NSCLC, gene signature, Principle component analysis
Date Deposited: 29 Jun 2015 13:57
Last Modified: 15 Nov 2016 14:27

