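Affiliation: [1] School of Computer Science and Technology, Fudan University, Shanghai 200433
Source: Computer Engineering (《计算机工程》), 2014, No. 10, pp. 25-31 (7 pages)
Funding: National Natural Science Foundation of China (Grant No. 60773075)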
Abstract: Most existing eXtensible Markup Language (XML) keyword search methods generate results from Lowest Common Ancestor (LCA) semantic subtrees, and therefore miss data that lies outside those subtrees but is relevant to the user's search intention. To address this problem, an XML keyword query method based on extended query expressions is proposed. User query logs serve as the statistical model for query expansion: the logs are analyzed and combined with the optimal retrieval concept to decide whether a query expression needs to be expanded. An XML TF-IDF method then computes the weights of candidate attributes, and, using the context information of the initial retrieval results, a clustering method selects the expansion keywords most relevant to the search intention, which yields the expanded query expression. Experimental results show that, compared with XSeek and a semantic-dictionary-based query expansion method, the proposed method improves the average F-measure by 7% and 17% respectively, indicating higher query quality.
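Keywords: information retrieval; eXtensible Markup Language (XML); Lowest Common Ancestor (LCA) semantics; keyword query; query expansion; context information
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]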
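
Note: The abstract takes LCA semantics as its starting point. As a rough illustration only, and not the authors' method, the sketch below finds the lowest common ancestor of two keyword matches in a small XML tree using Python's standard `xml.etree.ElementTree`. The sample document, the helper names `keyword_paths` and `lca`, and the case-insensitive substring matching are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def keyword_paths(root, keyword):
    """Return a root-to-node path (list of elements) for every element
    whose own text contains the keyword (case-insensitive substring match,
    a simplification chosen for this sketch)."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if node.text and keyword.lower() in node.text.lower():
            paths.append(path)
        for child in node:
            walk(child, path)

    walk(root, [])
    return paths


def lca(path_a, path_b):
    """Return the deepest element shared by two root-to-node paths."""
    common = None
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        common = a
    return common


# Hypothetical sample data, not taken from the paper.
DOC = """
<bib>
  <paper><title>XML keyword search</title><author>Li</author></paper>
  <paper><title>Query expansion</title><author>Wang</author></paper>
</bib>
"""

root = ET.fromstring(DOC)
xml_paths = keyword_paths(root, "xml")
li_paths = keyword_paths(root, "li")
wang_paths = keyword_paths(root, "wang")

# Both keywords occur inside the first <paper>, so that element is the LCA.
same_paper = lca(xml_paths[0], li_paths[0])
# The keywords occur in different papers, so the LCA is the <bib> root.
cross_paper = lca(xml_paths[0], wang_paths[0])
print(same_paper.tag, cross_paper.tag)
```

In the second case the only common ancestor is the document root, which groups two unrelated papers. An answer built purely from LCA subtrees can therefore be too coarse, or can leave out related nodes, which is the kind of limitation the abstract cites as motivation for expanding the query expression.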