Identification of Topic Sentence about Key Event in Chinese News (中文新闻关键事件的主题句识别)  Cited by: 18

Authors: Wang Wei (王伟)[1,2], Zhao Dongyan (赵东岩)[1,3], Zhao Wei (赵伟)[1]

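Affiliations: [1] Institute of Computer Science and Technology, Peking University, Beijing 100871; [2] Department of Electronic Technology, Engineering College of the Armed Police Force, Xi'an 710086; [3] Key Laboratory of Computational Linguistics, Ministry of Education, Beijing 100871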

Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》), 2011, No. 5, pp. 789-796 (8 pages)

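Funding: Supported by the National Natural Science Foundation of China (61003009); the Beijing Municipal Science and Technology Commission (Z101101005010003); and the Specialized Research Fund for the Doctoral Program of Higher Education (20100001120029)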

Abstract: The authors propose extracting a topic sentence from a single document in order to capture its key event information. Based on the genre characteristics of news, they analyze the relationship between news reports and the events they cover, as well as the characteristics of news headlines in three aspects: content, form and language. They then propose a method that uses the cues in the headline to extract the topic sentence describing the key event of a news story. The method first classifies the headline by its information content as informative or non-informative, and then combines sentence features such as word frequency, sentence length, position in the text and similarity to the headline (word co-occurrence) to score the importance of each sentence, selecting the most important one as the topic sentence. Experimental results show that the method identifies topic sentences accurately and lays a good foundation for subsequent event information extraction.

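Keywords: computer applications; Chinese information processing; natural language processing; automatic summarization; event extraction; news headline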

CLC number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
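
The two-step procedure described in the abstract (classify the headline, then score each sentence on word frequency, length, position and headline similarity) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record gives no formulas, so the overlap-based headline rule, the feature normalizations, the weights and all function names here are hypothetical. Sentences are assumed to be already word-segmented into token lists.

```python
from collections import Counter
from typing import List, Sequence

Tokens = Sequence[str]


def headline_is_informative(headline: Tokens, body: List[Tokens],
                            min_overlap: float = 0.5) -> bool:
    """Hypothetical rule: the headline counts as informative when at least
    `min_overlap` of its distinct words also occur in the body."""
    words = set(headline)
    if not words:
        return False
    body_vocab = {w for sent in body for w in sent}
    return len(words & body_vocab) / len(words) >= min_overlap


def score_sentences(headline: Tokens, body: List[Tokens],
                    ideal_len: int = 20) -> List[float]:
    """Score each sentence with four features; all weights are assumed."""
    freq = Counter(w for sent in body for w in sent)
    max_freq = max(freq.values(), default=1)
    head = set(headline)
    # Assumed weighting: trust headline overlap only for informative headlines.
    if headline_is_informative(headline, body):
        w_tf, w_len, w_pos, w_head = 0.2, 0.1, 0.3, 0.4
    else:
        w_tf, w_len, w_pos, w_head = 0.4, 0.2, 0.4, 0.0
    scores = []
    for i, sent in enumerate(body):
        if not sent:
            scores.append(0.0)
            continue
        tf = sum(freq[w] for w in sent) / (len(sent) * max_freq)  # word frequency
        length = min(len(sent), ideal_len) / ideal_len            # sentence length
        position = 1.0 / (i + 1)                                  # earlier is better
        overlap = len(head & set(sent)) / len(head) if head else 0.0
        scores.append(w_tf * tf + w_len * length
                      + w_pos * position + w_head * overlap)
    return scores


def pick_topic_sentence(headline: Tokens, body: List[Tokens]) -> int:
    """Return the index of the highest-scoring sentence (body must be non-empty)."""
    scores = score_sentences(headline, body)
    return max(range(len(scores)), key=scores.__getitem__)


body = [["昨天", "天气", "晴朗"],
        ["地震", "袭击", "城市", "造成", "损失"],
        ["居民", "开始", "重建"]]
print(pick_topic_sentence(["地震", "袭击", "城市"], body))
```

With an informative headline the overlap feature dominates and the second sentence is chosen; with a headline that shares no words with the body, the sketch falls back on frequency, length and position, which favors the lead sentence.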

 
