A Multi-Feature Fusion Approach to Semantic Relation Extraction from Web Medical Information


Authors: Long Liying (龙丽英)[1], Yan Jianzhuo (闫健卓)[1], Fang Liying (方丽英)[1], Li Pengying (李鹏英)[1], Liu Xinyue (刘欣悦)[1]

Affiliation: [1] College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124

Source: Beijing Biomedical Engineering (《北京生物医学工程》), 2016, No. 3, pp. 243-248 (6 pages)

Abstract: Objective: To give users more relevant, holistic, and structured Web medical information, this paper proposes a multi-feature fusion method for extracting the semantic relations between pairs of medical entities in Chinese Web medical information. Methods: Building on a hybrid syntactic parsing algorithm, the method constructs feature vectors from lexical terms, semantics, parts of speech, interaction words, entity-pair distance, entity types, and shortest dependency paths, and classifies them with a support vector machine. Experiments extract three relation types from Web medical information (mentor-apprentice, area of expertise, and affiliation) and compare extraction performance across parsing strategies, feature sets, and machine learning algorithms. Results: Judged by F-measure and running time, hybrid parsing performed best. Extraction performance improved steadily as features were added, and the final F-measures for the three relation types were 81.16%, 95.94%, and 86.16%, respectively. Conclusions: The multi-feature fusion method performs well at extracting semantic relations from Web medical information.
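The feature construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tag set, the entity type labels, and the example sentence are all hypothetical. The semantic and shortest-dependency-path features are omitted because they would need a lexicon and a parser that the abstract does not specify. The sketch covers only the lexical, part-of-speech, interaction-word, distance, and entity-type features, plus the F-measure used for evaluation.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set is hypothetical


def pair_features(tokens: List[Token],
                  e1: Tuple[int, str],
                  e2: Tuple[int, str]) -> Dict[str, float]:
    """Build a sparse feature dict for one entity pair.

    e1 and e2 are (token index, entity type); e1 must precede e2.
    """
    i, type1 = e1
    j, type2 = e2
    if not 0 <= i < j < len(tokens):
        raise ValueError("expected 0 <= e1 index < e2 index < len(tokens)")

    feats: Dict[str, float] = {}
    # Lexical and part-of-speech features of the two entities.
    feats[f"e1_word={tokens[i][0]}"] = 1.0
    feats[f"e2_word={tokens[j][0]}"] = 1.0
    feats[f"e1_pos={tokens[i][1]}"] = 1.0
    feats[f"e2_pos={tokens[j][1]}"] = 1.0
    # Entity-type feature for the ordered pair.
    feats[f"types={type1}|{type2}"] = 1.0
    # Interaction words: tokens strictly between the two entities.
    for word, pos in tokens[i + 1:j]:
        feats[f"between_word={word}"] = 1.0
        feats[f"between_pos={pos}"] = 1.0
    # Entity-pair distance, measured in intervening tokens.
    feats["distance"] = float(j - i - 1)
    return feats


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (harmonic mean of precision and recall)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example sentence: "张三 医生 擅长 心脏病".
tokens = [("张三", "nr"), ("医生", "n"), ("擅长", "v"), ("心脏病", "n")]
features = pair_features(tokens, (0, "PERSON"), (3, "DISEASE"))
```

A dict of this shape could then be vectorized and passed to a support vector machine, for example with scikit-learn's `DictVectorizer` and `SVC`; that pairing is an assumption here, since the abstract names only the classifier family.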

Keywords: Web medical information; semantic relation extraction; multi-feature; hybrid syntactic parsing; support vector machine

Classification number: R318.04 [Medicine and Health: Biomedical Engineering]

 
