Authors: LIU Gaojun (刘高军)[1]; FANG Xiao (方晓)[1]; DUAN Jianyong (段建勇)[1] (College of Information Science, North China University of Technology, Beijing 100144, China)
Source: Journal of Computer Applications (《计算机应用》), 2020, No. 11, pp. 3192-3197 (6 pages)
Funding: National Natural Science Foundation of China (61972003); CNONIX National Standard Application and Promotion Laboratory (4020548420G3).
Abstract: With the advent of the Internet era, search engines have come into widespread use. For unpopular data, a search engine may fail to retrieve the required data because the user's search terms cover too narrow a range; in such cases a query expansion system can effectively help the search engine provide reliable service. Building on query expansion methods based on global document analysis, a semantic relevance model is proposed that combines a neural network model with corpora containing semantic information, in order to extract deeper semantic information between words. This deep semantic information gives the query expansion system more comprehensive and effective feature support for analyzing the expandable relationships between words. Local expandable-word distributions are extracted from semantic data such as a synonym thesaurus (近义词林) and the sememe annotations of the language knowledge base "HowNet", and the deep mining ability of the neural network model is used to fit the local expandable-word distribution of each word in the corpus space into a global expandable-word distribution. In comparison experiments against query expansion methods based on a language model and on the synonym thesaurus, the method based on the semantic relevance model achieves higher query expansion efficiency; in particular, for unpopular search data, its recall is 11.1 and 5.29 percentage points higher, respectively, than that of the two comparison methods.
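As a rough illustration of the idea summarized in the abstract, the sketch below expands a query with entries from a tiny hand-made thesaurus and compares recall before and after expansion. It is not the authors' semantic relevance model (no neural network is involved); the thesaurus `TOY_THESAURUS`, the documents in `DOCS`, the relevance judgements, and the helper names `expand_query`, `retrieve`, and `recall` are all hypothetical.

```python
# Toy thesaurus-based query expansion; all data and names are hypothetical.
TOY_THESAURUS = {
    "car": ["automobile", "vehicle"],
    "doctor": ["physician"],
}

DOCS = {
    1: "used car for sale",
    2: "automobile repair manual",
    3: "electric vehicle charging guide",
    4: "physician appointment booking",
}


def expand_query(terms, thesaurus):
    """Return the original terms plus their thesaurus neighbours, without duplicates."""
    expanded = list(terms)
    for term in terms:
        for candidate in thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def retrieve(terms, docs):
    """Return ids of documents that contain at least one query term."""
    wanted = set(terms)
    return {doc_id for doc_id, text in docs.items() if wanted & set(text.split())}


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


relevant = {1, 2, 3}  # assumed relevance judgements for the query "car"
baseline = recall(retrieve(["car"], DOCS), relevant)
expanded = recall(retrieve(expand_query(["car"], TOY_THESAURUS), DOCS), relevant)
print(f"recall without expansion: {baseline:.2f}")
print(f"recall with expansion: {expanded:.2f}")
```

In this toy setup the narrow query matches only one of the three relevant documents, while the expanded query matches all three. The figures are specific to the made-up data and say nothing about the gains reported in the paper.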
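Keywords: query expansion; semantic relevance; deep learning; global document analysis; language model
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]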