Construction of a Semantic Graph-Based Multi-Document Summary Extraction Model for Medicine  Cited by: 11

Constructing Semantic Graph Based on Summary Extracting Model for Multiple Medical Documents


Authors: Zhang Han[1], Zhao Yuhong[1]

Affiliation: [1] School of Medical Informatics, China Medical University, Shenyang 110122

Source: Library and Information Service (《图书情报工作》), 2017, No. 8, pp. 112-119 (8 pages)

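Funding: An output of the Ministry of Education Humanities and Social Sciences Research Youth Fund project "Multi-Document Automatic Summarization Based on Semantic Predication Network Properties: The Case of Biomedicine" (Project No. 13YJC870030)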

Abstract: [Purpose/significance] To address the characteristics of medical text, this paper proposes a semantic graph-based method for multi-document automatic summarization and uses the semantic information in the graph to identify the themes of the summary. [Method/process] SemRep is used to extract normalized concepts and their semantic relations from the source documents, and these are used to construct a semantic graph. Key information is then extracted from the graph at three levels (concepts, relations, and communities) to generate the summary. Concepts are categorized through a three-level mapping from concept to semantic type to semantic group, and the summary is divided into themes with the help of semantic collocation patterns (schemas). [Result/conclusion] Tests on datasets for five diseases show that the method effectively identifies the core content of a document set, and that the semantic information in the graph allows the summary to be divided into themes accurately.

Keywords: CLIQUE; semantic graph; multi-document automatic summarization; theme identification

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
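
Illustrative sketch: the pipeline outlined in the abstract (semantic graph construction, then extraction at the concept, relation, and community levels, then theme assignment through a concept to semantic type to semantic group mapping) might look roughly like the following. This is not the authors' implementation. The abstract does not say how core concepts are ranked or how communities are detected, so degree ranking and maximal cliques (suggested only by the keyword CLIQUE) are assumptions. The predications, the simplified lookup tables, and the `SCHEMAS` patterns are hypothetical values made up for illustration.

```python
from collections import Counter, defaultdict

# Toy predications in a SemRep-like (subject, predicate, object) form.
# Illustrative values only; they are not taken from the paper's datasets.
PREDICATIONS = [
    ("Metformin", "TREATS", "Diabetes Mellitus"),
    ("Insulin", "TREATS", "Diabetes Mellitus"),
    ("Metformin", "INTERACTS_WITH", "Insulin"),
    ("Obesity", "PREDISPOSES", "Diabetes Mellitus"),
    ("Metformin", "TREATS", "Diabetes Mellitus"),  # same relation, another document
    ("Diabetes Mellitus", "CAUSES", "Retinopathy"),
]

# Simplified, hypothetical lookups: concept -> semantic type -> semantic group.
SEMANTIC_TYPE = {
    "Metformin": "phsu", "Insulin": "phsu",
    "Diabetes Mellitus": "dsyn", "Obesity": "dsyn", "Retinopathy": "dsyn",
}
SEMANTIC_GROUP = {"phsu": "CHEM", "dsyn": "DISO"}

# Hypothetical collocation patterns (subject group, predicate, object group) -> theme.
SCHEMAS = {
    ("CHEM", "TREATS", "DISO"): "treatment",
    ("DISO", "CAUSES", "DISO"): "etiology",
    ("DISO", "PREDISPOSES", "DISO"): "etiology",
}


def build_graph(predications):
    """Return an undirected adjacency map and a frequency count of each triple."""
    adj = defaultdict(set)
    rel_counts = Counter(predications)
    for subj, _pred, obj in predications:
        adj[subj].add(obj)
        adj[obj].add(subj)
    return dict(adj), rel_counts


def core_concepts(adj, k):
    """Concept level: top-k nodes by degree (assumed ranking), ties broken by name."""
    return sorted(adj, key=lambda c: (-len(adj[c]), c))[:k]


def core_relations(rel_counts, core):
    """Relation level: triples linking two core concepts, most frequent first."""
    keep = set(core)
    hits = [t for t in rel_counts if t[0] in keep and t[2] in keep]
    return sorted(hits, key=lambda t: (-rel_counts[t], t))


def maximal_cliques(adj):
    """Community level (assumed): maximal cliques via basic Bron-Kerbosch."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def theme_of(triple):
    """Map a triple to a theme through its semantic-group collocation pattern."""
    subj, pred, obj = triple
    pattern = (SEMANTIC_GROUP[SEMANTIC_TYPE[subj]], pred,
               SEMANTIC_GROUP[SEMANTIC_TYPE[obj]])
    return SCHEMAS.get(pattern, "other")


if __name__ == "__main__":
    adj, rel_counts = build_graph(PREDICATIONS)
    core = core_concepts(adj, 3)
    for triple in core_relations(rel_counts, core):
        print(theme_of(triple), triple)
    print([sorted(c) for c in maximal_cliques(adj) if len(c) >= 3])
```

Grouping the selected relations by the theme returned from `theme_of` gives one possible way to divide a summary into themes; a real system would take concepts, semantic types, and predicates from SemRep output rather than hand-written tables.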

 
