Disease and Bacteria Entity Extraction Based on Linguistic Rule  Cited by: 9


Authors: Xu Hua[1,2], Liu Maofu[1,2], Jiang Li[1,2], Gu Jinguang[1,2]

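Affiliations: [1] College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, Hubei, China; [2] Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, Hubei, China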

Source: Journal of Wuhan University: Natural Science Edition, 2015, No. 2, pp. 151-155 (5 pages)

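Funding: Supported by the National Natural Science Foundation of China (61100133) and a Major Program of the National Social Science Foundation of China (11&Z189)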

Abstract: Entity extraction is a fairly mature task in natural language processing. As the volume of electronic medical text grows rapidly, medical entity extraction is attracting increasing attention in the medical field. However, generic entity extraction methods generally achieve low accuracy on specialized medical terminology. This paper uses linguistic rules to extract diseases, symptoms, and pathogenic bacteria from drug package inserts and evaluates the accuracy of the extraction. First, the text is segmented and part-of-speech tagged against an existing terminology list, and an initial set of entities is extracted. Second, linguistic rules are applied to recognize further medical entities, which improves extraction accuracy. Experimental results show that accuracy exceeds 80% for each type of medical entity (diseases, symptoms, and pathogenic bacteria).
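The two-step procedure described in the abstract (term-list lookup first, linguistic rules second) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the seed `TERM_LIST`, the suffix-based `SUFFIX_RULES`, the `SPLIT_PATTERN` used in place of a real word segmenter and part-of-speech tagger, and the reading of "accuracy" as the share of extracted entities that are correct.

```python
import re

# Hypothetical seed term list; the paper's actual vocabulary is not given.
TERM_LIST = {
    "肺炎": "disease",
    "咳嗽": "symptom",
    "大肠埃希菌": "bacteria",
}

# Hypothetical suffix rules: a phrase ending in one of these characters
# is guessed to be an entity of the given type.
SUFFIX_RULES = [
    ("菌", "bacteria"),
    ("炎", "disease"),
    ("病", "disease"),
    ("痛", "symptom"),
]

# Crude stand-in for word segmentation: split on enumeration punctuation
# and a few function words. A real segmenter would not split inside words
# that happen to contain "和", "及", or "或".
SPLIT_PATTERN = re.compile(r"[、,。;和及或]|用于|伴有|引起的|所致的")


def extract_entities(text):
    """Return (phrase, type, source) triples in order of appearance."""
    results = []
    for phrase in SPLIT_PATTERN.split(text):
        phrase = phrase.strip()
        if not phrase:
            continue
        # Step 1: exact lookup in the existing term list.
        if phrase in TERM_LIST:
            results.append((phrase, TERM_LIST[phrase], "term_list"))
            continue
        # Step 2: fall back to the suffix rules for unlisted phrases.
        for suffix, label in SUFFIX_RULES:
            if len(phrase) > 1 and phrase.endswith(suffix):
                results.append((phrase, label, "rule"))
                break
    return results


def precision(extracted, gold):
    """Share of extracted (phrase, type) pairs that appear in the gold set."""
    pairs = {(phrase, label) for phrase, label, _ in extracted}
    if not pairs:
        return 0.0
    return len(pairs & gold) / len(pairs)


if __name__ == "__main__":
    sample = "用于大肠埃希菌引起的肺炎、支气管炎,伴有咳嗽和头痛。"
    for entity in extract_entities(sample):
        print(entity)
```

In this sketch the term list catches "大肠埃希菌", "肺炎", and "咳嗽", while the suffix rules pick up "支气管炎" and "头痛", which are absent from the list. Suffix rules of this kind trade precision for coverage, so a real system would likely constrain them with part-of-speech or context patterns.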

Keywords: entity extraction; medical domain; linguistic rules

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
