Comparative analysis of correlation scoring algorithms based on content similarity

Cited by: 2

Authors: Bao Zhiguo; Wang Haian; Hu Shiwei; Ma Xifeng (College of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450046)

Affiliation: [1] College of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450046

Source: Electronic Test (《电子测试》), 2022, No. 19, pp. 52-55 (4 pages)

Abstract: Intelligent recommendation is usually implemented in one of two ways: a user-based collaborative filtering recommender system, or a recommender system based on content similarity. Collaborative filtering typically requires a fairly large user base, so this paper focuses on recommendation based on content similarity. Such a system needs to score the relevance between each document and the corresponding title or morpheme, and uses these scores to give each user personalized recommendations and thus a better experience. The TF-IDF and BM25 algorithms are used for this relevance scoring. This paper discusses the principles, advantages and disadvantages, and improvement schemes of the two algorithms, with emphasis on the differences and connections between TF-IDF and BM25.
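To make the contrast named in the abstract concrete, the following is a minimal sketch of the two scoring functions in their commonly cited textbook forms. It is not taken from the paper: the whitespace tokenizer, the smoothed IDF variant, and the parameter defaults `k1=1.5` and `b=0.75` are all assumptions chosen for illustration, and Chinese text would need a proper word segmenter in place of `tokenize`.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def idf(term, docs):
    # Smoothed inverse document frequency (one common variant, assumed here).
    # The "+ 1.0" inside the log keeps the value positive for every term.
    n = sum(1 for d in docs if term in d)
    return math.log((len(docs) - n + 0.5) / (n + 0.5) + 1.0)


def tfidf_score(query, doc, docs):
    # Relative term frequency times IDF, summed over the query terms.
    if not doc:
        return 0.0
    tf = Counter(doc)
    return sum((tf[t] / len(doc)) * idf(t, docs) for t in query)


def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    # Okapi BM25: term frequency saturates (controlled by k1) and is
    # normalized by document length relative to the average (controlled by b).
    tf = Counter(doc)
    avgdl = (sum(len(d) for d in docs) / len(docs)) or 1.0
    score = 0.0
    for t in query:
        f = tf[t]
        norm = f + k1 * (1.0 - b + b * len(doc) / avgdl)
        score += idf(t, docs) * f * (k1 + 1.0) / norm
    return score


if __name__ == "__main__":
    corpus = [
        "bm25 ranking function for search",
        "tf idf weights terms by frequency",
        "bm25 bm25 bm25 bm25 saturates term frequency",
    ]
    docs = [tokenize(c) for c in corpus]
    query = tokenize("bm25")
    for text, doc in zip(corpus, docs):
        print(text, tfidf_score(query, doc, docs), bm25_score(query, doc, docs))
```

In this sketch both functions share the same IDF term, so the difference lies entirely in how term frequency is handled: the TF-IDF score grows linearly with the relative frequency of a term, while the BM25 contribution of any single term is bounded above by `idf * (k1 + 1)` regardless of how often the term repeats.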

Keywords: text similarity; BM25 algorithm; TF-IDF algorithm; semantic analysis

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
