Recognition of temporal relation in Chinese electronic medical records  Cited by: 6

Original title: 中文电子病历中的时间关系识别

Authors: SUN Jian (孙健), GAO Daqi (高大启) [1], RUAN Tong (阮彤) [1], YIN Yichao (殷亦超) [2], GAO Ju (高炬) [2], WANG Qi (王祺)

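Affiliations: [1] School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237; [2] Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200021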

Source: Journal of Computer Applications (《计算机应用》), 2018, No. 3, pp. 626-632 (7 pages)

Funding: National 863 Program project (2015AA020107); National Science and Technology Support Program project (2015BAH12F01-05)

Abstract: Temporal relations, or temporal links (denoted by the TLink tag), in Chinese electronic medical records comprise temporal relations within a sentence (hereafter "within-sentence TLinks") and temporal relations between sentences ("between-sentence TLinks"). Within-sentence TLinks cover event/event TLinks and event/time TLinks, whereas between-sentence TLinks are event/event TLinks only. Temporal relation recognition in Chinese electronic medical record text is cast as a classification problem over entity pairs. For within-sentence TLinks, high-accuracy heuristic rules were formulated, and classifiers were trained on basic features, phrase syntax features, dependency features, and other features to reduce recognition errors. For between-sentence TLinks, in addition to high-accuracy heuristic rules, classifiers were trained on basic features, phrase syntax features, and other features to reduce recognition errors. Experimental results show that the proposed method performs best when Support Vector Machine (SVM), SVM, and Random Forest (RF) algorithms are used for within-sentence event/event, within-sentence event/time, and between-sentence event/event TLinks respectively, reaching F1 scores of 84.0%, 85.6%, and 63.5%.
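
The entity-pair formulation in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Entity` fields, the restriction of between-sentence pairs to adjacent sentences, the single adjacency rule, the label names `OVERLAP` and `BEFORE`, and the stub classifier are all assumptions made for illustration. In the paper's setting the stub would be replaced by a trained SVM or RF model over the listed feature groups.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    text: str
    kind: str      # "EVENT" or "TIME"
    sentence: int  # index of the sentence containing the entity
    position: int  # token offset inside that sentence


Pair = Tuple[Entity, Entity]
Rule = Callable[[Pair], Optional[str]]


def candidate_pairs(entities: List[Entity]) -> Dict[str, List[Pair]]:
    """Group entity pairs into the three TLink types named in the abstract."""
    groups: Dict[str, List[Pair]] = {
        "within_event_event": [],
        "within_event_time": [],
        "between_event_event": [],
    }
    ordered = sorted(entities, key=lambda e: (e.sentence, e.position))
    for a, b in combinations(ordered, 2):
        kinds = {a.kind, b.kind}
        if a.sentence == b.sentence:
            if kinds == {"EVENT"}:
                groups["within_event_event"].append((a, b))
            elif kinds == {"EVENT", "TIME"}:
                groups["within_event_time"].append((a, b))
        elif b.sentence - a.sentence == 1 and kinds == {"EVENT"}:
            # Assumption: only events in adjacent sentences are paired.
            groups["between_event_event"].append((a, b))
    return groups


def adjacent_time_rule(pair: Pair) -> Optional[str]:
    """Hypothetical rule: an event directly next to a time expression overlaps it."""
    a, b = pair
    if {a.kind, b.kind} == {"EVENT", "TIME"} and abs(a.position - b.position) == 1:
        return "OVERLAP"
    return None  # the rule does not fire


def stub_classifier(pair: Pair) -> str:
    """Placeholder for a trained classifier; always predicts document order."""
    return "BEFORE"


def label_pair(pair: Pair, rules: List[Rule], classifier: Callable[[Pair], str]) -> str:
    """Apply heuristic rules first, then fall back to the classifier."""
    for rule in rules:
        label = rule(pair)
        if label is not None:
            return label
    return classifier(pair)


entities = [
    Entity("3天前", "TIME", 0, 1),
    Entity("发热", "EVENT", 0, 2),
    Entity("咳嗽", "EVENT", 0, 4),
    Entity("入院", "EVENT", 1, 0),
]
groups = candidate_pairs(entities)
labels = [label_pair(p, [adjacent_time_rule], stub_classifier)
          for p in groups["within_event_time"]]
```

Applying rules before the classifier mirrors the division of labor described in the abstract: the rules settle the pairs they cover with high accuracy, and the classifier handles the remainder.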

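Keywords: temporal relation recognition; entity pair classification; within-sentence temporal relation; between-sentence temporal relation; linguistic features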

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
