Combining Deep Learning and Association Patterns Mining for Query Expansion


Authors: HUANG Ming-xuan (黄名选) [1,2]; HU Xiao-chun (胡小春) [2]

Affiliations: [1] Guangxi Key Laboratory of Cross-border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003, China; [2] School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2022, No. 6, pp. 1293-1302 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (61762006).

Abstract: This paper proposes a query expansion model that combines deep learning with association pattern mining. The model uses a support-confidence evaluation framework based on copula functions to mine expansion terms from the pseudo-relevance feedback documents returned by the initial retrieval, building a Statistical Expansion Term Set (SETS). It also trains word embeddings on the initially retrieved documents with a deep learning tool to obtain a Word Embedding Expansion Term Set (WEETS). The final expansion terms are obtained by combining SETS and WEETS. The model therefore accounts both for the association between the original query and the expansion terms found through statistical analysis and mining, and for the contextual semantics of expansion terms within documents, which improves the quality of the expansion terms. Experiments on the NTCIR-5 CLIR corpus show that the model improves retrieval performance, and that its average gains in MAP and P@5 exceed those of comparable query expansion methods from recent years. The model can also be applied to cross-language retrieval systems to improve their performance.
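The pipeline described in the abstract (mine statistical expansion terms, collect embedding-based expansion terms, then combine the two sets) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the copula-based scoring formulas, so plain document-level support and confidence stand in for them; small hand-written vectors stand in for embeddings trained on the retrieved documents; and a simple set union stands in for the fusion step. All function names, thresholds, and data are hypothetical.

```python
import math


def statistical_expansion_terms(docs, query_terms, min_support, min_confidence):
    """Select terms t where some rule q -> t (q a query term) passes both thresholds.

    Assumption: plain support/confidence, not the copula-based framework.
    """
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)
    query = set(query_terms)
    candidates = set().union(*doc_sets) - query
    selected = {}
    for term in candidates:
        for q in query:
            q_count = sum(1 for d in doc_sets if q in d)
            both = sum(1 for d in doc_sets if q in d and term in d)
            if q_count == 0:
                continue
            support = both / n           # share of documents containing q and term
            confidence = both / q_count  # share of q's documents that contain term
            if support >= min_support and confidence >= min_confidence:
                selected[term] = max(selected.get(term, 0.0), confidence)
    return selected


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_expansion_terms(vectors, query_terms, top_k):
    """Rank non-query terms by mean cosine similarity to the query terms."""
    query = [q for q in query_terms if q in vectors]
    scores = {}
    for term, vec in vectors.items():
        if term in query_terms or not query:
            continue
        scores[term] = sum(cosine(vec, vectors[q]) for q in query) / len(query)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


def fuse(statistical_terms, embedding_terms):
    """Assumed fusion rule: the union of both term sets, sorted for stable output."""
    return sorted(set(statistical_terms) | set(embedding_terms))


# Toy pseudo-relevance feedback documents, already tokenized.
docs = [
    ["query", "expansion", "retrieval", "feedback"],
    ["query", "expansion", "embedding"],
    ["query", "retrieval", "ranking"],
    ["image", "segmentation"],
]
# Toy 2-d vectors standing in for trained word embeddings.
vectors = {
    "query": [1.0, 0.0],
    "expansion": [0.9, 0.1],
    "embedding": [0.8, 0.6],
    "ranking": [0.6, 0.8],
    "image": [0.0, 1.0],
}

sets_terms = statistical_expansion_terms(docs, ["query"], 0.5, 0.6)
weets_terms = embedding_expansion_terms(vectors, ["query"], top_k=2)
final_terms = fuse(sets_terms, weets_terms)
print(final_terms)
```

In this toy run the statistical step keeps only terms that co-occur with the query term in at least half of the documents, while the embedding step contributes a term ("embedding") that the co-occurrence thresholds reject, which is the complementary effect the abstract attributes to combining the two sets.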

Keywords: natural language processing; information retrieval; text mining; query expansion; word embedding; deep learning

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
