Query-Oriented Summarization Based on Genetic Algorithm

Authors: Wang Hai (王海) [1], Hu Po (胡珀) [2]

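Affiliations: [1] School of Electronic Information Engineering, Wuhan Polytechnic, Wuhan, Hubei 430074; [2] Department of Computer Science, Central China Normal University, Wuhan, Hubei 430079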

Source: 《微计算机信息》 (Control & Automation), 2009, No. 28, pp. 23-25 (3 pages)

Abstract: Query-oriented summarization has become an active research topic in text mining in recent years. Its goal is to automatically generate a concise, personalized summary biased toward the user's query. This paper treats summarization as an optimization problem and proposes a sentence-extraction strategy based on a genetic algorithm. Random summaries, each a different set of sentences that satisfies the summary length limit, form the initial population, and an evaluation function of the summary's overall characteristics serves as the fitness function. The global search ability of the genetic algorithm is then used to find a sentence set whose overall characteristics are close to optimal, and this set is output as the summary. Because query preference and redundancy are both integrated into the fitness function, the generated summary has better overall quality. Experiments on 100 news texts on different topics, randomly selected from the Sina website, verify the effectiveness of the proposed strategy and method.
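
The abstract does not specify the chromosome encoding, the genetic operators, or the exact fitness formula, so the following Python sketch is only one possible reading of the approach, not the authors' implementation. It assumes whitespace tokenization (Chinese text would first need word segmentation), bag-of-words cosine similarity, and a fitness of the form "query relevance minus weighted pairwise redundancy". The function name `summarize` and all parameters (`max_words`, `redundancy_weight`, `mutation_rate`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def summarize(sentences, query, max_words, pop_size=30, generations=60,
              redundancy_weight=0.5, mutation_rate=0.2, seed=0):
    """Return a sorted tuple of selected sentence indices (illustrative GA)."""
    rng = random.Random(seed)
    bows = [Counter(s.lower().split()) for s in sentences]
    lengths = [sum(b.values()) for b in bows]
    q = Counter(query.lower().split())
    relevance = [cosine(b, q) for b in bows]
    # Only sentences that fit the length limit on their own can be selected.
    candidates = [i for i, n in enumerate(lengths) if n <= max_words]
    if not candidates:
        return ()

    def fitness(ind):
        # Assumed fitness: summed query relevance minus weighted redundancy.
        rel = sum(relevance[i] for i in ind)
        red = sum(cosine(bows[i], bows[j]) for i, j in combinations(ind, 2))
        return rel - redundancy_weight * red

    def pack(order):
        # Greedily keep sentences from `order` while the length limit holds.
        chosen, used = [], 0
        for i in order:
            if used + lengths[i] <= max_words:
                chosen.append(i)
                used += lengths[i]
        return tuple(sorted(chosen))

    def random_individual():
        # A random summary: a random subset that respects the length limit.
        k = rng.randint(1, len(candidates))
        return pack(rng.sample(candidates, k))

    def crossover(a, b):
        # Child draws its sentences from the union of both parents.
        pool = sorted(set(a) | set(b))
        rng.shuffle(pool)
        return pack(pool)

    def mutate(ind):
        # Occasionally replace one selected sentence with a random candidate.
        if rng.random() >= mutation_rate:
            return ind
        order = list(ind)
        order[rng.randrange(len(order))] = rng.choice(candidates)
        rng.shuffle(order)
        return pack(dict.fromkeys(order))  # dict.fromkeys drops duplicates

    def select(pop):
        # Tournament selection among up to three random individuals.
        return max(rng.sample(pop, min(3, len(pop))), key=fitness)

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=fitness)]  # elitism: carry the best forward
        while len(nxt) < pop_size:
            nxt.append(mutate(crossover(select(pop), select(pop))))
        pop = nxt
    return max(pop, key=fitness)
```

Calling `summarize(sentences, "genetic algorithm summary", max_words=20)` on a list of sentence strings returns the indices of the chosen sentences. Every individual is repaired through `pack`, so the length limit holds throughout the search rather than being enforced by a penalty term. This is a design choice of the sketch, not something stated in the abstract.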

Keywords: query-oriented summarization; genetic algorithm; sentence extraction

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
