Query Expansion Combining Copulas Theory and Association Rules Mining


Authors: HUANG Mingxuan [1,2], HU Xiaochun [2]

Affiliations: [1] Guangxi Key Laboratory of Cross-Border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003; [2] School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003

Source: Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2021, No. 2, pp. 176-187 (12 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 61762006).

Abstract: Copulas theory is introduced into the mining of association patterns among text feature terms, and a query expansion algorithm combining Copulas theory with association rule mining is proposed. First, the top n documents returned by the initial retrieval are extracted to build a pseudo-relevance feedback document set (PRFDS) or a user relevance feedback document set (URFDS). Then, support and confidence measures based on Copulas theory are used to mine frequent itemsets of feature terms and association rule patterns that contain the original query terms from the PRFDS or URFDS, and expansion terms are extracted from these patterns to carry out query expansion. Experiments on the NTCIR-5 CLIR Chinese and English corpora show that the proposed algorithm effectively curbs query topic drift and word mismatch, improves information retrieval performance, raises the quality of expansion terms, and reduces the number of invalid expansion terms.
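The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's support and confidence formulas, so everything below is an assumption made for illustration: the Clayton copula, the way the two marginals are defined (a document-frequency share and a term-weight share), the restriction to 2-itemsets, the thresholds, and all function names (`clayton_copula`, `copula_support`, `expand_query`) are hypothetical and are not taken from the paper.

```python
def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v) for u, v in [0, 1] and theta > 0 (assumed choice)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def copula_support(itemset, docs, theta=2.0):
    """Hypothetical copula-based support of an itemset.

    docs is a list of {term: positive weight} dicts (the feedback document set).
    u: share of documents that contain every term of the itemset.
    v: share of term weight, using the smallest item weight in each matching
       document, normalised by the sum of each document's largest weight.
    Both marginals shrink as the itemset grows, so support is anti-monotone.
    """
    hits = [d for d in docs if all(t in d for t in itemset)]
    if not hits:
        return 0.0
    u = len(hits) / len(docs)
    total = sum(max(d.values()) for d in docs if d)
    v = sum(min(d[t] for t in itemset) for d in hits) / total
    return clayton_copula(u, v, theta)


def expand_query(query_terms, docs, min_sup=0.2, min_conf=0.5, theta=2.0):
    """Return (term, confidence) pairs for rules q -> t that pass both thresholds."""
    query = set(query_terms)
    candidates = {t for d in docs for t in d} - query
    scores = {}
    for q in query:
        sup_q = copula_support((q,), docs, theta)
        if sup_q == 0.0:
            continue  # query term never occurs in the feedback set
        for t in candidates:
            sup_qt = copula_support((q, t), docs, theta)
            conf = sup_qt / sup_q  # confidence of the rule q -> t
            if sup_qt >= min_sup and conf >= min_conf:
                scores[t] = max(scores.get(t, 0.0), conf)
    # Highest confidence first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Toy feedback set: the top-n documents as term -> weight maps (made-up values).
    feedback_docs = [
        {"copula": 0.9, "dependence": 0.8, "finance": 0.2},
        {"copula": 0.7, "dependence": 0.6},
        {"copula": 0.5, "weather": 0.9},
        {"rain": 0.8, "weather": 0.7},
    ]
    for term, conf in expand_query(["copula"], feedback_docs):
        print(term, round(conf, 3))
```

Requiring every rule to have an original query term as its antecedent is what ties the expansion terms to the query; in this sketch, terms that co-occur with a query term only rarely or only with low weight fall below the thresholds and are not added.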

Keywords: natural language processing; query expansion; information retrieval; association rules; text mining

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
