Site Search Engine Design with Weighted Mapping Method


Authors: Jiang Wenlong (江文龙)[1], Zhao Fengyu (赵逢禹)[1], Chen Zhang (陈章)[1]

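Affiliation: [1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China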

Source: Computer Applications and Software (《计算机应用与软件》), 2016, No. 4, pp. 91-94 (4 pages)

Abstract: Neither general-purpose search engines nor the site search mechanisms that websites provide can perform content-based search of corporate website information. After analysing the types of information found on corporate websites, we propose a general site search engine architecture to address this problem. We present the design of the engine and introduce its implementation techniques, including the object mapping and matching method, the weighted object similarity calculation algorithm, and index construction. The engine supports search and positioning based on web page content and on the content of Word and PDF attachments. Experimental results show that the method achieves high precision and recall. The engine can support content search and personalised services for corporate websites.

Keywords: site search; object mapping; attachment content; object similarity

Classification code: TP319 [Automation and Computer Technology: Computer Software and Theory]

 
