
Revealing functionally coherent subsets using a spectral clustering and an information integration approach

Richards, AJ and Schwacke, JH and Rohrer, B and Cowart, LA and Lu, X (2012) Revealing functionally coherent subsets using a spectral clustering and an information integration approach. BMC Systems Biology, 6 (SUPPL3).

Full text: PDF, Published Version (1MB). Available under license; see the attached license file.
Licence: Plain Text (1kB). Available under license; see the attached license file.

Abstract

Background: Contemporary high-throughput analyses often produce lengthy lists of genes or proteins. It is desirable to divide the genes into functionally coherent subsets for further investigation, by integrating heterogeneous information regarding the genes. Here we report a principled approach for managing and integrating multiple data sources within the framework of graph-spectrum analysis in order to identify coherent gene subsets.

Results: We investigated several approaches to integrate information derived from different sources that reflect distinct aspects of gene functional relationships, including functional annotations of genes in the form of the Gene Ontology, co-mentioning of genes in the literature, and shared transcription factor binding sites among genes. Given a list of genes, we constructed a graph containing the genes in each information space; the graphs were then kernel-transformed so that they could be integrated; finally, functionally coherent subsets were identified using a spectral clustering algorithm. In a series of simulation experiments, known functionally coherent gene sets were mixed and recovered using our approach.

Conclusions: The results indicate that spectral clustering approaches are capable of recovering coherent gene modules even under noisy conditions, and that information integration serves to further enhance this capability. When applied to a real-world data set, our methods revealed biologically sensible modules and highlighted the importance of information integration. The implementation of the statistical model is provided under the GNU General Public License, as an installable Python module, at: http://code.google.com/p/spectralmix.

© 2012 Richards et al; licensee BioMed Central Ltd.
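The pipeline summarized in the abstract (one graph per information source, a kernel transform, integration, then spectral clustering) can be illustrated with a minimal sketch. This is not the spectralmix implementation. The Gaussian kernel, the equal-weight averaging, the two-way split on the Fiedler vector, and the toy distance matrices are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import numpy as np


def gaussian_kernel(dist, sigma=0.5):
    """Turn a distance matrix into a similarity kernel (assumed kernel choice)."""
    return np.exp(-dist ** 2 / (2.0 * sigma ** 2))


def integrate_kernels(kernels, weights=None):
    """Combine per-source kernels as a weighted average (one simple option)."""
    kernels = np.asarray(kernels, dtype=float)
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return np.tensordot(weights, kernels, axes=1)


def spectral_bipartition(affinity):
    """Split nodes in two by the sign of the Fiedler vector of the
    symmetric normalized Laplacian L = I - D^(-1/2) W D^(-1/2)."""
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(affinity)) - (
        d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    )
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    labels = (vecs[:, 1] > 0).astype(int)
    # Eigenvector sign is arbitrary, so fix the first gene to label 0.
    return 1 - labels if labels[0] == 1 else labels


def block_distances(within, between, n=3):
    """Toy distances for two modules of n genes each (made-up data)."""
    d = np.full((2 * n, 2 * n), between, dtype=float)
    d[:n, :n] = within
    d[n:, n:] = within
    np.fill_diagonal(d, 0.0)
    return d


# Three hypothetical information spaces over the same six genes.
annotation = block_distances(within=0.2, between=1.0)
literature = block_distances(within=0.4, between=0.9)
literature[2, 3] = literature[3, 2] = 0.1  # one misleading co-mention
binding_sites = block_distances(within=0.3, between=1.0)

kernels = [gaussian_kernel(d) for d in (annotation, literature, binding_sites)]
combined = integrate_kernels(kernels)
labels = spectral_bipartition(combined)
print(labels)
```

For more than two modules, a common alternative is to keep several leading eigenvectors and cluster their rows, for example with k-means. Whether that matches the published method should be checked against the paper itself.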



Details

Item Type: Article
Status: Published
Creators/Authors:
Richards, AJ
Schwacke, JH
Rohrer, B
Cowart, LA
Lu, X (Email: xinghua@pitt.edu; Pitt Username: XINGHUA)
Date: 17 December 2012
Date Type: Publication
Journal or Publication Title: BMC Systems Biology
Volume: 6
Number: SUPPL3
DOI or Unique Handle: 10.1186/1752-0509-6-s3-s7
Schools and Programs: School of Medicine > Biomedical Informatics
Refereed: Yes
Date Deposited: 02 Dec 2016 14:55
Last Modified: 23 Jan 2019 23:55
URI: http://d-scholarship.pitt.edu/id/eprint/29790
