An Introductory Genomics Workflow for Exploring Publicly Available Infectious Disease Data

Vyas, Praveer S. (2022) An Introductory Genomics Workflow for Exploring Publicly Available Infectious Disease Data. Master Essay, University of Pittsburgh.

Supplemental Material:
Archive (TGZ): Scripts for performing illustrative DE analysis in Linux (5kB)
Other: Scripts for performing illustrative DE analysis on macOS (20kB)
Archive (ZIP): Scripts for performing illustrative DE analysis on Windows (5kB)


Advances in computing and gene sequencing technology give public health students and professionals a way to gain exposure to biological research. Differential expression (DE) analysis can be performed with publicly available tools and data to learn more about the biological differences between samples from humans, animals, or pathogens. A vast amount of gene expression data is publicly available and can be searched for a dataset related to a topic of interest. For example, infectious disease epidemiology students could use their own computers to perform a DE analysis on an existing dataset related to a trend they studied or observed, without needing to enter a laboratory. DE analyses can be performed quickly on personal computers using pseudoalignment software, which is faster and less computationally intensive than aligning RNA-seq reads to a reference genome. This essay provides an algorithm for performing a DE analysis on an infectious disease topic using pseudoalignment. The basic requirements for using this algorithm are a working knowledge of the statistical programming language R, familiarity with executing shell scripts, and a general understanding of the central dogma of biology. The approach gives users experience performing complex genomics analyses and furthers their professional development. An illustrative analysis is provided on a public health issue: progression of disease in individuals with latent tuberculosis infection. Direct applications of the results from this example are discussed, along with how individuals in public health may benefit from using this algorithm and expanding their genomics skill set.


Item Type: Other Thesis, Dissertation, or Long Paper (Master Essay)
Status: Unpublished
Creators: Vyas, Praveer S. (Email: psv9@pitt.edu; Pitt Username: psv9)
Contributors:
Committee Chair: Martinson, Jeremy J. (Email: jmartins@pitt.edu; Pitt Username: jmartins; ORCID: UNSPECIFIED)
Committee Member: Nachega, Jean B. (Email: jbn16@pitt.edu; Pitt Username: jbn16; ORCID: UNSPECIFIED)
Date: 17 May 2022
Date Type: Completion
Submission Date: 29 April 2022
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 32
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Infectious Diseases and Microbiology
Degree: MPH - Master of Public Health
Thesis Type: Master Essay
Refereed: Yes
Uncontrolled Keywords: genomics, workflow, rna-seq, infectious-diseases
Date Deposited: 17 May 2022 16:27
Last Modified: 17 May 2022 16:27

