
Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence

Bekhuis, T and Tseytlin, E and Mitchell, KJ and Demner-Fushman, D (2014) Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence. PLoS ONE, 9 (1).



Objectives: Evidence-based medicine depends on the timely synthesis of research findings. An important source of synthesized evidence resides in systematic reviews. However, a bottleneck in review production involves dual screening of citations with titles and abstracts to find eligible studies. For this research, we tested the effect of various kinds of textual information (features) on the performance of a machine learning classifier. Based on our findings, we propose an automated system to reduce screening burden and to offer quality assurance.

Methods: We built a database of citations from 5 systematic reviews that varied with respect to domain, topic, and sponsor. Consensus judgments regarding eligibility were inferred from published reports. We extracted 5 feature sets from citations: alphabetic, alphanumeric+, indexing, features mapped to concepts in systematic reviews, and topic models. To simulate a two-person team, we divided the data into random halves. We optimized the parameters of a Bayesian classifier, then trained and tested models on alternate data halves. Overall, we conducted 50 independent tests.

Results: All tests of summary performance (mean F3) surpassed the corresponding baseline, P<0.0001. The ranks for mean F3, precision, and classification error were statistically different across feature sets averaged over reviews; P-values for Friedman's test were .045, .002, and .002, respectively. Differences in ranks for mean recall were not statistically significant. Alphanumeric+ features were associated with best performance; mean reduction in screening burden for this feature type ranged from 88% to 98% for the second pass through citations and from 38% to 48% overall.

Conclusions: A computer-assisted, decision support system based on our methods could substantially reduce the burden of screening citations for systematic review teams and solo reviewers. Additionally, such a system could deliver quality assurance both by confirming concordant decisions and by naming studies associated with discordant decisions for further consideration. © 2014 Bekhuis et al.
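
The abstract reports mean F3 as its summary measure without defining it. Assuming it denotes the standard F-beta measure with beta = 3 (which weights recall more heavily than precision), a minimal sketch of the calculation might look like the following. The function name `f_beta` and the counts in the example are hypothetical and are not taken from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 3.0) -> float:
    """Compute F-beta from confusion-matrix counts.

    Assumption: "F3" in the abstract is the standard F-beta with beta = 3.
    tp = eligible citations correctly flagged,
    fp = ineligible citations flagged as eligible,
    fn = eligible citations missed.
    """
    b2 = beta * beta
    # Equivalent to (1 + b2) * P * R / (b2 * P + R), written in counts
    # so that zero precision or recall does not cause a division error.
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the study).
    print(f_beta(tp=45, fp=30, fn=5))
```

With beta = 3, a missed eligible citation (false negative) costs nine times as much as a false positive in the denominator, which suits screening tasks where overlooking an eligible study is the more serious error.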




Item Type: Article
Status: Published
Creators:
Bekhuis, T (Email: tcb24@pitt.edu; Pitt Username: TCB24; ORCID: 0000-0002-8537-9077)
Tseytlin, E (Email: tseytlin@pitt.edu; Pitt Username: TSEYTLIN)
Mitchell, KJ (Email: kjm84@pitt.edu; Pitt Username: KJM84)
Demner-Fushman, D
Date: 27 January 2014
Date Type: Publication
Journal or Publication Title: PLoS ONE
Volume: 9
Number: 1
DOI or Unique Handle: 10.1371/journal.pone.0086277
Schools and Programs: School of Medicine > Biomedical Informatics
Refereed: Yes
Article Type: Review
Date Deposited: 23 Jun 2014 21:11
Last Modified: 22 Jun 2021 14:55

