A pipeline for classifying close family relationships with dense SNP data and putative pedigree information

Zeng, Zhen (2015) A pipeline for classifying close family relationships with dense SNP data and putative pedigree information. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

When genome-wide association studies (GWAS) or sequencing studies are performed on family-based datasets, the genotype data can be used to check the structure of putative pedigrees. Even in datasets of putatively unrelated people, close relationships can often be detected using dense single-nucleotide polymorphism/variant (SNP/SNV) data.
A number of methods exist for inferring relationships from dense genetic data, but all have limitations; in particular, they typically rely on average genetic sharing, which captures only part of the available information. We present a set of approaches for classifying relationships in GWAS or whole-genome sequencing datasets. We first propose an empirical method for detecting identity-by-descent (IBD) segments in close relative pairs using unphased dense SNP data, and we demonstrate how that information can assist in building a relationship classifier. We then develop a strategy that uses putative pedigree information to enhance classification accuracy. Our methods are tested and illustrated with two SNP array datasets from two distinct populations. With these new techniques, we propose classification pipelines for checking and identifying pair-wise relationships in datasets containing a large number of small pedigrees.
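To make the contrast between average sharing and segment-level information concrete, the sketch below computes two simple quantities for a pair of unphased genotype vectors. It is an illustration only and is not the method developed in the dissertation: the 0/1/2 alternate-allele coding, the function names, and the `min_snps` threshold are all assumptions introduced here. It uses identity-by-state (IBS) as a crude proxy, treating stretches with no opposite-homozygote sites (IBS0) as regions compatible with sharing at least one allele IBD.

```python
from typing import List, Sequence, Tuple


def ibs_states(g1: Sequence[int], g2: Sequence[int]) -> List[int]:
    """Per-SNP IBS state (0, 1, or 2 shared alleles) for unphased genotypes.

    Genotypes are assumed to be coded as alternate-allele counts 0/1/2.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    if not g1:
        raise ValueError("genotype vectors must not be empty")
    return [2 - abs(a - b) for a, b in zip(g1, g2)]


def mean_ibs(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Average proportion of alleles shared IBS across all SNPs."""
    states = ibs_states(g1, g2)
    return sum(states) / (2 * len(states))


def ibs0_free_runs(
    g1: Sequence[int], g2: Sequence[int], min_snps: int = 3
) -> List[Tuple[int, int]]:
    """Inclusive (start, end) index ranges of consecutive SNPs with IBS >= 1.

    Long IBS0-free runs are compatible with IBD sharing; min_snps is a
    hypothetical threshold chosen only for this illustration.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    states = ibs_states(g1, g2)
    for i, state in enumerate(states):
        if state > 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(states) - start >= min_snps:
        runs.append((start, len(states) - 1))
    return runs


if __name__ == "__main__":
    a = [0, 1, 2, 2, 0, 1, 1, 2]
    b = [0, 1, 1, 0, 0, 2, 1, 2]
    print(mean_ibs(a, b))
    print(ibs0_free_runs(a, b))
```

Two pairs can have the same mean IBS while differing in how their IBS0 sites are distributed along the genome, which is the kind of information an average discards. Real data would need far longer runs, genetic-map distances, and tolerance for genotyping error.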
We also explore the performance of the pipeline on a whole exome sequencing dataset. Although the classifier based on SNP array data does not perform well on exome sequencing data, it can in principle be modified using new algorithm parameters and training data in order to achieve better performance.
Finally, we develop a method to reconstruct pedigrees from pair-wise relationship information. Our method can reconstruct core pedigrees with high accuracy, and pair-wise relationship inferences can be further improved during this process.
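As a rough illustration of working from pair-wise relationship calls toward family units, the sketch below groups individuals into candidate families by linking any pair called as related. This is only a plausible first step under stated assumptions, not the reconstruction method of the dissertation: the relationship labels, the `group_families` name, and the input format are hypothetical, and the sketch does not infer generations or parent-child direction.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_families(pairs: Iterable[Tuple[str, str, str]]) -> List[List[str]]:
    """Group individuals into candidate families with union-find.

    Each input item is (id_a, id_b, relationship). Any label other than
    "unrelated" links the two individuals into the same group.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, relationship in pairs:
        root_a, root_b = find(a), find(b)
        if relationship != "unrelated" and root_a != root_b:
            parent[root_a] = root_b

    groups: Dict[str, set] = defaultdict(set)
    for individual in list(parent):
        groups[find(individual)].add(individual)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    calls = [
        ("A", "B", "parent-offspring"),
        ("B", "C", "full-sibling"),
        ("D", "E", "half-sibling"),
        ("A", "F", "unrelated"),
    ]
    print(group_families(calls))
```

Within each group, a fuller reconstruction would still have to reconcile the pair-wise calls with one another, for example checking that two individuals who share a parent are called as siblings or half-siblings.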
Detecting close family relationships and reconstructing pedigrees are important in both population-based and family-based studies. Providing precise pedigrees and hidden relatedness information helps increase the accuracy and power of various genetic analyses and avoids false positive associations, making these studies more efficient in identifying the genetic basis of diseases. This is a crucial step on the path to developing better treatments and interventions and improving public health.

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Zeng, Zhen	zhz43@pitt.edu	ZHZ43

ETD Committee:

Title	Member	Email Address	Pitt Username	ORCID
Committee Chair	Feingold, Eleanor	feingold@pitt.edu	FEINGOLD
Committee Member	Weeks, Daniel E.	weeks@pitt.edu	WEEKS	0000-0001-9410-7228
Committee Member	Tseng, George C.	ctseng@pitt.edu	CTSENG
Committee Member	Chen, Wei	weichen.mich@gmail.com

Date:

28 September 2015

Date Type:

Publication

Defense Date:

6 May 2015

Approval Date:

28 September 2015

Submission Date:

12 June 2015

Access Restriction:

2 year -- Restrict access to University of Pittsburgh for a period of 2 years.

Number of Pages:

Institution:

University of Pittsburgh

Schools and Programs:

School of Public Health > Biostatistics

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

family relationships, IBD, GWAS, sequencing studies, classification, pedigree reconstruction

Date Deposited:

28 Sep 2015 16:59

Last Modified:

30 Jun 2022 15:52

URI:

http://d-scholarship.pitt.edu/id/eprint/25384
