
Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

Bekhuis, T (2006) Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy. Biomedical Digital Libraries, 3. ISSN 1742-5581

PDF (Review article with suggestions for further research)
Published Version
Available under License: See the attached license file.



Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. This paper presents background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), then briefly reviews Swanson's ideas and discusses recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is also mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd.




Item Type: Article
Status: Published
Creators: Bekhuis, T
Email: tcb24@pitt.edu
Pitt Username: TCB24
ORCID: 0000-0002-8537-9077
Date: 3 April 2006
Date Type: Publication
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Journal or Publication Title: Biomedical Digital Libraries
Volume: 3
DOI or Unique Handle: 10.1186/1742-5581-3-2
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Library and Information Science
School of Medicine > Biomedical Informatics
Refereed: Yes
ISSN: 1742-5581
Article Type: Review
MeSH Headings: Medical Informatics; Medical Informatics Computing; Computational Biology; Data Mining; Databases, Bibliographic; Natural Language Processing
PubMed Central ID: PMC1459187
PubMed ID: 16584552
Date Deposited: 11 Sep 2012 21:39
Last Modified: 20 Nov 2018 16:55

