Text Material Recommendation Method Combining Label Classification and Semantic Query Expansion  Cited by: 1


Authors: MENG Yiyue; PENG Rong[1]; LYU Qibiao (School of Computer Science, Wuhan University, Wuhan 430072, China)

Affiliation: [1] School of Computer Science, Wuhan University, Wuhan 430072, China

Source: Computer Science (《计算机科学》), 2023, No. 1, pp. 76-86 (11 pages)

Funding: Ministry of Education-China Mobile Joint Fund Project (MCM2020J01).

Abstract: When preparing planning and research reports, writers often need to collect and read a large volume of text materials according to a drafted table of contents or set of headings, then classify and organize those materials before selecting what to use. The workload is heavy and the quality of the result cannot be guaranteed. To address this, a text material recommendation method combining label classification and semantic query expansion is proposed for the domain of digital government planning document preparation. From an information retrieval perspective, the headings at every level of the table of contents are treated as queries and the reference text materials as target documents, so that materials can be retrieved and recommended. Based on the differential evolution algorithm, the method combines three text material recommendation methods, based respectively on word vector averaging, semantic query expansion, and label classification. This compensates for the shortcomings of traditional text material recommendation methods and makes it possible to retrieve paragraph-level text materials using the headings of the table-of-contents structure. Experiments on 10 datasets show that the method improves performance significantly. It can greatly reduce the workload of manual material selection and of material classification, and it lowers the difficulty of document preparation.
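The abstract states that differential evolution is used to combine three recommenders but gives no details, so the following is only a minimal sketch of one way such a combination could work. Everything in it is an assumption made for illustration: the linear weighted fusion, the use of mean reciprocal rank as the fitness, the DE/rand/1/bin variant and its parameters, the toy scores, and all function and variable names. None of these are taken from the paper.

```python
import random

# Hypothetical toy data: for each query (heading), three score lists over the
# same four candidate paragraphs, one list per recommender, in the order
# word-vector average, semantic query expansion, label classification.
TOY_QUERIES = [
    {"scores": [[0.6, 0.7, 0.2, 0.1],
                [0.8, 0.3, 0.4, 0.2],
                [0.5, 0.5, 0.1, 0.9]],
     "relevant": {0}},
    {"scores": [[0.3, 0.2, 0.9, 0.4],
                [0.7, 0.1, 0.6, 0.2],
                [0.2, 0.3, 0.8, 0.1]],
     "relevant": {2}},
]

def fuse(weights, score_lists):
    """Weighted sum of the per-recommender scores for each paragraph."""
    n = len(score_lists[0])
    return [sum(w * s[i] for w, s in zip(weights, score_lists)) for i in range(n)]

def reciprocal_rank(scores, relevant):
    """1 / rank of the first relevant paragraph, ranking by descending score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    for rank, idx in enumerate(order, start=1):
        if idx in relevant:
            return 1.0 / rank
    return 0.0

def fitness(weights, queries):
    """Mean reciprocal rank of the fused ranking over all queries (assumed metric)."""
    total = sum(reciprocal_rank(fuse(weights, q["scores"]), q["relevant"])
                for q in queries)
    return total / len(queries)

def optimize_weights(queries, pop_size=12, generations=40, f=0.6, cr=0.9, seed=0):
    """Search fusion weights in [0, 1] with a basic DE/rand/1/bin loop."""
    rng = random.Random(seed)
    dim = len(queries[0]["scores"])
    # Seed the population with the single-recommender baselines so the result
    # is never worse than using any one recommender alone.
    pop = [[1.0 if d == k else 0.0 for d in range(dim)] for k in range(dim)]
    pop += [[rng.random() for _ in range(dim)] for _ in range(pop_size - dim)]
    fit = [fitness(p, queries) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][d])
            trial_fit = fitness(trial, queries)
            if trial_fit >= fit[i]:  # greedy selection keeps the better vector
                pop[i], fit[i] = trial, trial_fit
    best = max(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]

if __name__ == "__main__":
    weights, score = optimize_weights(TOY_QUERIES)
    print("weights:", [round(w, 3) for w in weights], "MRR:", score)
```

In this toy setup each recommender alone ranks the relevant paragraph first for only some of the queries, while a suitable weighted combination can rank it first for both. That is the kind of complementarity a learned fusion is meant to exploit.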

Keywords: text material recommendation; information retrieval; digital government; query expansion; differential evolution algorithm

Classification code: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

 
