Towards the Construction of a Transcriptional Landscape of the Human Genome: Data Analysis and Data Compression

Lin, Yuefeng (2014) Towards the Construction of a Transcriptional Landscape of the Human Genome: Data Analysis and Data Compression. Master's Thesis, University of Pittsburgh. (Unpublished)

This is the latest version of this item.

Abstract

In this thesis, we built a genome-wide polyadenylation map from sequencing data sets covering various human tissues and cell lines. Using the map, we analyzed the pattern and distribution of polyadenylation sites in the human genome and explored the differential polyadenylation patterns of non-coding and novel genes. We also created the Expression and Polyadenylation Database (xPAD) as a web portal for the polyadenylation map, and we identified regulatory marks that might correlate with the polyadenylation sites we found.
In addition, we unveiled a novel group of small YB-1-associated RNAs and investigated their possible regulatory mechanism, finding that multiple transcription factors and histone modifications may mark the locations of YB-1-associated RNAs.
We also implemented an Assembly-based Sequencing data Encoding Tool, AbSEnT. With this tool, we demonstrated the feasibility and efficiency of a novel assembly-based compression algorithm by achieving a higher compression ratio than general-purpose compression tools. We further investigated the distribution of word frequencies in sequencing data and found that it resembles that of natural languages. If this connection can be established, knowledge and experience from natural language research could be applied to the analysis of sequencing data.
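As an illustration of the word-frequency idea, one common way to compare a corpus with natural language is a rank-frequency table, where words are ordered from most to least frequent. The sketch below assumes that a "word" is an overlapping k-mer of fixed length; the abstract does not say how the thesis defines words, so this definition, the function names `kmer_counts` and `rank_frequency`, and the toy reads are all hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-length word across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def rank_frequency(counts):
    """Return (rank, frequency) pairs, most frequent word first."""
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


# Toy reads for illustration only; these are not data from the thesis.
reads = ["ACGTACGT", "CGTACGTA", "TTTTACGT"]
counts = kmer_counts(reads, k=3)
table = rank_frequency(counts)
```

Plotting rank against frequency on log-log axes is the usual next step; an approximately straight line would suggest a power-law distribution similar to the one reported for natural-language text.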


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Lin, Yuefeng (email: windyue87@gmail.com)
ETD Committee:
  • Thesis Advisor: Camacho, Carlos J. (email: ccamacho@pitt.edu; Pitt username: CCAMACHO)
  • Committee Member: Zuckerman, Daniel M. (email: ddmmzz@pitt.edu; Pitt username: DDMMZZ)
  • Committee Member: Clark, Nathan L. (email: nclark@pitt.edu; Pitt username: NCLARK)
Date: 28 October 2014
Date Type: Publication
Defense Date: 28 August 2014
Approval Date: 28 October 2014
Submission Date: 28 October 2014
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 131
Institution: University of Pittsburgh
Schools and Programs: School of Medicine > Computational and Systems Biology
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Sequencing data analysis, alternative polyadenylation, assembly-based sequencing data compression, small RNAs
Date Deposited: 28 Oct 2014 13:36
Last Modified: 15 Nov 2016 14:25
URI: http://d-scholarship.pitt.edu/id/eprint/23424
