Authors: HUANG Mingxuan (黄名选) [1,2]; HU Xiaochun (胡小春) [2]
Affiliations: [1] Guangxi Key Laboratory of Cross-Border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003; [2] School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2021, No. 2, pp. 176-187 (12 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 61762006).
Abstract: Copulas theory is introduced into the association pattern mining of text feature terms, and a query expansion algorithm combining Copulas theory with association rule mining is proposed. First, the top n documents of the document set returned by the initial query are extracted to construct a pseudo-relevance feedback document set (PRFDS) or a user relevance feedback document set (URFDS). Then, support and confidence measures based on Copulas theory are applied to the PRFDS or URFDS to mine feature-term frequent itemsets and association rule patterns that contain the original query terms, and expansion terms are extracted from these patterns to realize query expansion. Experiments on the NTCIR-5 CLIR Chinese and English text corpora show that the proposed algorithm effectively curbs query topic drift and word mismatch, improves information retrieval performance, raises the quality of expansion terms, and reduces invalid expansion terms.
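Keywords: natural language processing; query expansion; information retrieval; association rules; text mining
Classification No.: TP311 [Automation and Computer Technology: Computer Software and Theory]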
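The pipeline outlined in the abstract (take the top n documents as feedback, mine rules that involve the original query terms, keep the consequents as expansion terms) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not define the Copulas-based support and confidence, so the sketch substitutes the classical definitions and mines only two-term rules of the form "query term -> candidate term". The function name `expand_query`, its parameters, and the threshold values are hypothetical and chosen for illustration only.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, n=3, min_support=0.5, min_confidence=0.6):
    """Return expansion terms mined from the top-n feedback documents.

    ranked_docs is a list of token lists ordered by the initial retrieval.
    ASSUMPTION: classical support/confidence stand in for the Copulas-based
    measures, which this record does not define.
    """
    query = set(query_terms)
    # Pseudo-relevance feedback set: the first n documents, as term sets.
    feedback = [set(doc) for doc in ranked_docs[:n]]
    if not feedback:
        return []

    term_count = Counter()  # number of feedback documents containing a term
    pair_count = Counter()  # number containing both a query term and a candidate
    for doc in feedback:
        term_count.update(doc)
        for q in query & doc:
            for t in doc - query:
                pair_count[(q, t)] += 1

    total = len(feedback)
    scores = {}
    for (q, t), count in pair_count.items():
        support = count / total             # support of the itemset {q, t}
        confidence = count / term_count[q]  # confidence of the rule q -> t
        if support >= min_support and confidence >= min_confidence:
            scores[t] = max(scores.get(t, 0.0), confidence)

    # Highest-confidence terms first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))


docs = [
    ["copula", "query", "expansion", "retrieval"],
    ["query", "expansion", "mining"],
    ["query", "retrieval", "index"],
    ["unrelated", "text"],
]
expansion_terms = expand_query(["query"], docs, n=3)
```

Restricting candidates to rules whose antecedent is an original query term mirrors the abstract's requirement that mined patterns contain the query terms; a fuller version would mine larger itemsets and use the paper's own measures.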