Research and Implementation of Automatic News Summarization Based on Named Entity

Cited by: 2

Authors: Suo Hongguang (索红光)[1], An Di (安迪)[1], Li Jian (李健)[1]

Affiliation: [1] College of Computer and Communication Engineering, China University of Petroleum, Dongying 257061

Source: 《情报学报》 (Journal of the China Society for Scientific and Technical Information), 2010, No. 1, pp. 32-37 (6 pages)

Abstract: Automatic summarization here means multi-document summarization on a specific topic, producing concise and important information. News topic summarization is one application of multi-document summarization; it helps readers quickly grasp the overall picture of a news event. This paper proposes a news topic summarization method based on named entities. The method first identifies and selects, from the set of articles on a news topic, the named entities that represent the elements of news (time, location, person, and organization), and counts their frequencies after semantic processing. It then computes a combined weight for each sentence from the frequencies of the named entities it contains, together with factors such as sentence position and length, and selects summary sentences according to these weights. Finally, the selected sentences are ordered by their timestamp information and output as the news topic summary. Experimental results indicate that the method is effective and of practical value.
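The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no formulas, so the `Sentence` type, the three weighting terms, the weights `w_entity`, `w_position`, `w_length`, and the `ideal_length` parameter are all assumptions made for this example. Named entities are assumed to have been extracted already, and the semantic processing step (for instance, merging aliases of the same entity) is omitted.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Sentence:
    text: str
    entities: list      # named entities, assumed already extracted upstream
    position: int       # 0-based index of the sentence within its article
    timestamp: date     # date associated with the sentence


def entity_frequencies(sentences):
    """Count how often each named entity occurs across the whole topic."""
    counts = Counter()
    for s in sentences:
        counts.update(s.entities)
    return counts


def score(sentence, freq, w_entity=1.0, w_position=0.5, w_length=0.5,
          ideal_length=30):
    """Combine entity frequency, position, and length into one weight.

    The form of each term and the default weights are illustrative
    assumptions; the source does not specify them.
    """
    total = sum(freq.values()) or 1
    # Share of all entity mentions covered by this sentence's distinct entities.
    entity_term = sum(freq[e] for e in set(sentence.entities)) / total
    # Earlier sentences in an article score higher.
    position_term = 1.0 / (1 + sentence.position)
    # Penalize sentences much shorter or longer than ideal_length characters.
    length = len(sentence.text)
    length_term = min(length, ideal_length) / max(length, ideal_length, 1)
    return (w_entity * entity_term
            + w_position * position_term
            + w_length * length_term)


def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences, then order them by timestamp."""
    freq = entity_frequencies(sentences)
    top = sorted(sentences, key=lambda s: score(s, freq), reverse=True)[:k]
    return sorted(top, key=lambda s: s.timestamp)
```

Calling `summarize(sentences, k)` on the sentences of a news topic returns `k` sentences in chronological order. Separating selection (by weight) from ordering (by timestamp) mirrors the two final steps the abstract describes.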

Keywords: automatic summarization; multi-document summarization; named entity

Classification: G210.7 [Cultural Science: Journalism]; G353.11

 
