Conceptualization of molecular findings by mining gene annotations

Chen, V and Lu, X (2013) Conceptualization of molecular findings by mining gene annotations. BMC Proceedings, 7.


Background: The Gene Ontology (GO) is an ontology representing molecular biology concepts related to genes and their products. Current annotations from the GO Consortium tend to be highly specific, and contemporary genome-scale studies often return a long list of genes of potential interest, such as genes that are differentially expressed in a tumor relative to normal tissue. It is therefore challenging to reveal, at a conceptual level, the major functional themes in which these genes are involved. There is a need for tools that reveal such themes by mining and representing semantic information in an objective and quantitative manner.

Methods: In this study, we utilized the hierarchical organization of the GO to derive a more abstract representation of the major biological processes of a list of genes based on their annotations. We cast the task as follows: given a list of genes, identify non-disjoint, functionally coherent subsets, such that the functions of the genes in each subset are summarized by an informative GO term that accurately captures the semantic information of the original annotations.

Results: We evaluated different metrics for assessing information loss when merging GO terms, and different statistical schemes for assessing the functional coherence of a set of genes. We found that the best discriminative power was achieved by combining an information-content-based measure as the information-loss metric with graph-based statistics derived from a Steiner tree connecting genes in an augmented GO graph.

Conclusions: Our methods provide an objective and quantitative approach to capturing the major directions of gene functions in a context-specific fashion.
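The abstract does not spell out the information-loss metric, so the following is only a minimal sketch of one common way an information-content-based loss could be computed when a specific term is merged into a more general ancestor. The toy ontology, the gene annotations, and the function names are all hypothetical, and the definition IC(t) = -log2 p(t) with loss = IC(term) - IC(ancestor) is an assumption, not necessarily the measure used in the paper.

```python
import math

# Hypothetical toy ontology (child -> parents); these are not real GO terms.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "glucose_metabolism": ["metabolism"],
}

# Hypothetical direct annotations (gene -> set of terms).
ANNOTATIONS = {
    "g1": {"lipid_metabolism"},
    "g2": {"lipid_metabolism"},
    "g3": {"glucose_metabolism"},
    "g4": {"signaling"},
}


def ancestors(term):
    """Return the term itself plus all of its ancestors in the toy DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content(term):
    """Assumed definition: IC(t) = -log2 p(t), where p(t) is the fraction
    of genes annotated to t directly or through a descendant term."""
    covered = sum(
        1
        for terms in ANNOTATIONS.values()
        if any(term in ancestors(t) for t in terms)
    )
    if covered == 0:
        raise ValueError(f"no gene is annotated to {term!r}")
    return -math.log2(covered / len(ANNOTATIONS))


def information_loss(term, ancestor):
    """Assumed loss when replacing `term` with a more general `ancestor`."""
    if ancestor not in ancestors(term):
        raise ValueError(f"{ancestor!r} is not an ancestor of {term!r}")
    return information_content(term) - information_content(ancestor)


if __name__ == "__main__":
    for target in ("metabolism", "root"):
        loss = information_loss("lipid_metabolism", target)
        print(f"lipid_metabolism -> {target}: loss = {loss:.3f} bits")
```

Under these assumptions, merging into a nearer ancestor loses less information than merging all the way to the root, which is the trade-off a summarizing term has to balance against how many genes it covers.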



Item Type: Article
Status: Published
Creators:
Chen, V (Email: vic14@pitt.edu; Pitt Username: VIC14)
Lu, X (Email: xinghua@pitt.edu; Pitt Username: XINGHUA)
Date: 20 December 2013
Date Type: Publication
Journal or Publication Title: BMC Proceedings
Volume: 7
DOI or Unique Handle: 10.1186/1753-6561-7-s7-s2
Schools and Programs: School of Medicine > Biomedical Informatics
Refereed: Yes
Date Deposited: 02 Dec 2016 18:31
Last Modified: 30 Mar 2021 09:55

