Authors: XUE Zhenyu, YU Zhengtao [1,2], GAO Shengxiang [1,2]
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China
Source: Computer Engineering (《计算机工程》), 2022, Issue 8, pp. 274-282, 291 (10 pages in total)
Funding: National Natural Science Foundation of China (61972186, 61762056, 61472168); National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100); Yunnan Province Major Science and Technology Special Project (202002AD080001); Yunnan Province High-Tech Talent Project (201606, 202105AC160018); Yunnan Province Basic Research Program (202001AS070014, 2018FB104).
Abstract: Existing Chinese-Vietnamese cross-language news event retrieval methods make little use of event entity knowledge from the news domain. Moreover, when a candidate document contains multiple events, the events unrelated to the query sentence reduce the matching accuracy between the query sentence and the candidate document, which degrades retrieval performance. This study proposes a Chinese-Vietnamese cross-language news event retrieval model that incorporates event entity knowledge. A query translation method translates each Chinese event query sentence into a Vietnamese event query sentence, which turns the cross-language news event retrieval problem into a monolingual one. Because a query sentence contains only a single event, whereas several events may coexist in a candidate document and hinder precise matching, event trigger words are used to divide the candidate document into event scopes, reducing interference from events unrelated to the query. On this basis, a knowledge graph and the event trigger words are used to obtain rich knowledge representations of event entities. Through interaction between the query sentence and the document event scopes, the model extracts ranking features between event entity knowledge representations and words, as well as among the event entity knowledge representations themselves. Experimental results on a Chinese-Vietnamese bilingual news dataset show that, compared with baseline models such as BM25, Conv-KNRM, and ATER, the proposed model achieves better cross-language news event retrieval performance, with NDCG and MAP improved by up to 0.7122 and 0.5872, respectively.
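Keywords: cross-language retrieval; event entity; event trigger word; event scope; learning to rank; event retrieval
Classification code: TP18 [Automation and Computer Technology - Control Theory and Control Engineering]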
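Note on the evaluation metrics: the abstract reports gains in NDCG and MAP without defining them. The sketch below shows one common way these ranking metrics are computed. It is not the authors' evaluation code. The function names, the linear-gain form of DCG, and the assumption that every relevant document appears somewhere in the ranked list are illustrative choices, not details taken from the paper.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k labels (linear gain; an assumption)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision(relevances):
    """Average precision for one query; any label greater than 0 counts as relevant.

    Assumes all relevant documents are present in the ranked list.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0
```

Each input list holds the relevance labels of the retrieved documents in ranked order for one query, so `ndcg([3, 2, 1], 3)` scores a perfectly ordered ranking and `mean_average_precision` averages over a list of such per-query lists.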