Zhang, Danchen and He, Daqing and Zhao, Sanqiang and Li, Lei
(2017)
Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed.
In: BioNLP Workshop 2017.
(In Press)
Abstract
Assigning a standard ICD-9-CM code to disease symptoms in medical texts is an important task in the medical domain. Automating this process could greatly reduce the costs. However, the effectiveness of an automatic ICD-9-CM code classifier faces a serious problem, which can be triggered by unbalanced training data. Frequent diseases often have more training data, which helps its classification to perform better than that of an infrequent disease. However, a disease’s frequency does not necessarily reflect its importance. To resolve this training data shortage problem, we propose to strategically draw data from PubMed to enrich the training data when there is such need. We validate our method on the CMC dataset, and the evaluation results indicate that our method can significantly improve the code assignment classifiers' performance at the macro-averaging level.
Share
Citation/Export: |
|
Social Networking: |
|
Details
Metrics
Monthly Views for the past 3 years
Plum Analytics
Actions (login required)
|
View Item |