Cross Language Query Post-Translation Expansion Based on Matrix-Weighted Association Rules  Cited by: 10


Authors: HUANG Mingxuan (黄名选) [1,2]; JIANG Caoqing (蒋曹清) [1,2]; HE Donglei (何冬蕾) [1,2]

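Affiliations: [1] Guangxi Key Laboratory Cultivation Base of Cross-border E-commerce Intelligent Information Processing (Guangxi University of Finance and Economics), Nanning 530003; [2] School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003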

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2018, No. 10, pp. 887-898 (12 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 61762006, 61662003, 61262028)

Abstract: A method for computing matrix-weighted itemset support is first proposed, together with a matrix-weighted association pattern mining algorithm for cross-language query expansion. Building on this, a cross-language query post-translation expansion algorithm based on matrix-weighted association rule mining is presented. An initial cross-language retrieval is performed with the aid of machine translation to obtain the top initially retrieved documents (TIRDs), and relevance feedback documents (RFDs) are obtained from the TIRDs through user relevance judgment. Matrix-weighted frequent itemsets containing original query terms are mined from the RFDs by computing support, and association rules containing original query terms are extracted from the frequent itemsets under a confidence-interest evaluation framework. The consequents or antecedents of the rules serve as expansion terms, and the confidence and interest of each rule measure the importance of its expansion terms, which completes the post-translation expansion. Experiments on the NTCIR-5 CLIR standard test set show that the proposed algorithm effectively improves cross-language query expansion performance and benefits cross-language retrieval of long queries, and that post-translation consequent expansion outperforms antecedent expansion.
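The abstract does not give the paper's formulas, so the pipeline it describes can only be sketched under assumptions. The Python below is a minimal illustration, not the authors' algorithm: the support definition (sum of the minimum term weight over documents containing every item, divided by the number of documents), the use of lift as the "interest" measure, the thresholds, and the names `support` and `expand_query` are all hypothetical stand-ins. It also restricts itself to rules of the form "query term -> candidate term" (consequent expansion over 2-itemsets), whereas the paper mines general frequent itemsets.

```python
def support(itemset, docs):
    """Assumed weighted support (NOT the paper's formula).

    Each document is a dict mapping term -> weight in [0, 1]. A document
    contributes the smallest weight among the itemset's terms, and only if
    it contains every term. The sum is divided by the number of documents.
    """
    if not docs:
        return 0.0
    mass = sum(min(d[t] for t in itemset)
               for d in docs if all(t in d for t in itemset))
    return mass / len(docs)


def expand_query(query_terms, docs, min_sup=0.1, min_conf=0.5, min_lift=1.0):
    """Return {expansion_term: score} from rules 'query term -> candidate'.

    Confidence is sup(q, t) / sup(q); lift (our assumed interest measure)
    is confidence / sup(t). The score kept per term is its best confidence.
    """
    query = set(query_terms)
    candidates = sorted({t for d in docs for t in d} - query)
    expansions = {}
    for q in sorted(query):
        sup_q = support({q}, docs)
        for t in candidates:
            sup_qt = support({q, t}, docs)
            # Skip infrequent pairs; the zero check also guards the divisions.
            if sup_qt == 0 or sup_qt < min_sup:
                continue
            conf = sup_qt / sup_q
            lift = conf / support({t}, docs)
            if conf >= min_conf and lift >= min_lift:
                expansions[t] = max(expansions.get(t, 0.0), conf)
    return expansions
```

Here `docs` would hold the term weights of the relevance feedback documents (for instance normalised tf-idf values), and the returned scores could be used to weight the expansion terms when the expanded query is resubmitted.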

Keywords: matrix-weighted association pattern; association rule; query expansion; cross-language information retrieval

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
