
INCLUSION OF 48 PACIFIC ISLANDERS WITHIN A COSMOPOLITAN REFERENCE PANEL IS SUFFICIENT FOR HIGH ACCURACY GENOTYPE IMPUTATION OF SAMOANS

Anderson, Kevin (2022) INCLUSION OF 48 PACIFIC ISLANDERS WITHIN A COSMOPOLITAN REFERENCE PANEL IS SUFFICIENT FOR HIGH ACCURACY GENOTYPE IMPUTATION OF SAMOANS. Master's Thesis, University of Pittsburgh. (Unpublished)

Restricted to University of Pittsburgh users only until 10 May 2024.


Abstract

Imputation is a computational method, commonly used in genome-wide association studies, for inferring genotypes based on prior knowledge of shared haplotype structure. Genotype frequencies not only play an important role in imputation but also vary widely around the world, so it is crucial to adjust for population bias in genetic studies. Common methods for imputation use publicly available haplotype panels from 1000 Genomes, TOPMed, or other consortia. However, these panels contain data drawn mostly from individuals of European ancestry. Population isolates such as Polynesians benefit greatly in genotype accuracy when a population-specific haplotype reference panel is used. Here, I perform multiple imputations using the 1000 Genomes phase III reference panel and genome-wide data from 1285, 384, 96, 48, 24, and 1 Samoan on chromosomes 5 and 21 to determine how many fully sequenced individuals must be included in study-specific haplotype panels to achieve accurate imputation. I also investigate the accuracy of these imputations on genotype frequencies of population-specific variants in the CREBRF and BTNL9 genes that were previously determined to be associated with higher BMI and lower HDL levels, respectively. I demonstrate that incorporating 96 Samoans within the 1000 Genomes cosmopolitan panel produces accurate imputation of rare variants (minor allele frequency of 1%), and that 24 Samoans suffice for common variants (minor allele frequency greater than 5%). These results show that creating a study-specific reference panel from a small subset of individuals from a population isolate, combined with a cosmopolitan panel, is a cost-effective strategy for accurate imputation.
The ability to perform fine-mapping on rare population-specific variants will have broad public health implications, such as a better understanding of genetic disease etiology and function and improved genetic literacy in these population isolates.
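
The two quantities the abstract relies on, minor allele frequency and imputation accuracy, can be illustrated with a minimal Python sketch. The abstract does not state which accuracy metric the thesis uses; the squared correlation between sequenced genotypes and imputed dosages shown here is one widely used choice and is an assumption, as are all function names. Only the 5% cutoff for common variants comes from the abstract.

```python
from statistics import mean


def minor_allele_frequency(genotypes):
    """Return the MAF given per-individual alternate-allele counts (0, 1, or 2)."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def is_common(maf, threshold=0.05):
    """Common variants are those with MAF above 5%, the cutoff used in the abstract."""
    return maf > threshold


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    Hypothetical accuracy metric for illustration only. Returns None when
    either vector has no variance, because the correlation is undefined.
    """
    mt, md = mean(true_genotypes), mean(imputed_dosages)
    cov = sum((t - mt) * (d - md) for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mt) ** 2 for t in true_genotypes)
    var_d = sum((d - md) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return None
    return cov * cov / (var_t * var_d)


if __name__ == "__main__":
    truth = [0, 1, 2, 0, 1, 0]                 # sequenced genotypes (toy data)
    imputed = [0.1, 0.9, 1.8, 0.0, 1.2, 0.2]   # imputed dosages (toy data)
    maf = minor_allele_frequency(truth)
    print(maf, is_common(maf), dosage_r2(truth, imputed))
```

In a design like the one described, such a metric would be computed per variant for each reference panel size and then summarized separately for rare and common variants.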



Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
Creators | Email | Pitt Username | ORCID
Anderson, Kevin | kja34@pitt.edu | kja34 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Carlson, Jenna | jnc35@pitt.edu | jnc35 | 0000-0001-5483-0833
Committee Chair | Weeks, Daniel | weeks@pitt.edu | weeks | 0000-0001-9410-7228
Thesis Advisor | Minster, Ryan | rminster@pitt.edu | rminster | 0000-0001-7382-6717
Date: 29 April 2022
Defense Date: 22 April 2022
Approval Date: 10 May 2022
Submission Date: 29 April 2022
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 79
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Human Genetics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: word
Date Deposited: 10 May 2022 20:00
Last Modified: 30 Jun 2022 15:16
URI: http://d-scholarship.pitt.edu/id/eprint/42901
