SA-LDA Incremental Crawling for Book Selection and Recommendation Based on Text Semantics


Author: LAN Yan (蓝燕) [1]

Affiliation: [1] Library of Huizhou University, Huizhou 516007, Guangdong, China

Source: Journal of Huizhou University (《惠州学院学报》), 2020, No. 3, pp. 71-75, 117 (6 pages)

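Funding: Huizhou Science and Technology Plan Project (2016ZX038); Teaching Research and Reform Project of the Department of Education of Guangdong Province (粤教高函〔2018〕180号).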

Abstract: To support libraries in purchasing the latest technical books across professional fields, this paper builds an ontology semantic library of domain keywords from web data, compares it for similarity with the library's subject bibliographic database, and uses a clustering algorithm to select books with high similarity and relevance for recommendation. Building on the class hierarchy of the semantic ontology library, the approach first crawls domain subject terms from the Web, then crawls incrementally to enrich the semantic library, and then computes text similarity against the subject keywords of the current book catalogue. A nearest-neighbor classification algorithm based on an entropy-based average class distance is proposed, and a similarity-based strategy for assigning and recommending books is implemented. Experiments show that the method improves the accuracy of purchasing the latest books and further raises the efficiency of big-data use in book purchasing.
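The similarity and assignment steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the entropy weighting, so this example swaps in plain bag-of-words cosine similarity and an unweighted average distance to each category. The function names (`cosine_similarity`, `average_distance`, `assign_category`) and the toy `semantic_library` data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two keyword lists, treated as bags of words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_distance(book_keywords, category_entries):
    """Mean cosine distance (1 - similarity) from a book to every entry of a category."""
    distances = [1.0 - cosine_similarity(book_keywords, e) for e in category_entries]
    return sum(distances) / len(distances)


def assign_category(book_keywords, semantic_library):
    """Return (category, distance) for the category with the smallest average distance."""
    scored = {
        cat: average_distance(book_keywords, entries)
        for cat, entries in semantic_library.items()
        if entries  # skip empty categories to avoid dividing by zero
    }
    best = min(scored, key=scored.get)
    return best, scored[best]


# Hypothetical toy data: each category holds keyword lists gathered by crawling.
semantic_library = {
    "machine_learning": [
        ["neural", "network", "training", "gradient"],
        ["classification", "clustering", "model", "training"],
    ],
    "databases": [
        ["sql", "index", "transaction", "query"],
        ["storage", "query", "replication", "index"],
    ],
}

# Assumption: a catalogue entry is reduced to its subject keywords.
book = ["deep", "neural", "network", "model", "training"]
category, distance = assign_category(book, semantic_library)
print(category, round(distance, 3))
```

In this sketch, incremental crawling would correspond to appending new keyword lists to a category in `semantic_library` and re-running `assign_category`. A recommendation step could then keep only the books whose distance falls below a chosen threshold.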

Keywords: ontology; text semantics; incremental crawling; book recommendation

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
