Authors: Bao Zhiguo (鲍治国); Wang Haian (王海安); Hu Shiwei (胡士伟); Ma Xifeng (马西锋) (College of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450046)
Affiliation: [1] College of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450046
Source: Electronic Test (《电子测试》), 2022, No. 19, pp. 52-55 (4 pages)
Abstract: Intelligent recommendation is usually implemented in one of two ways: a user-based collaborative filtering recommendation system or a content-similarity-based recommendation system. Collaborative filtering generally requires a relatively large user base, so this paper focuses on the content-similarity-based approach. Such a system scores the relevance between each document and the corresponding title or morphemes, then uses those scores to give every user personalized recommendations and a better experience. The TF-IDF and BM25 algorithms are used for this relevance scoring. The paper discusses the principles, advantages, disadvantages, and improvement schemes of both algorithms, with emphasis on the differences and connections between TF-IDF and BM25.
Keywords: text similarity; BM25 algorithm; TF-IDF algorithm; semantic analysis
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
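
The abstract describes scoring document relevance with TF-IDF and BM25 but the record itself gives no formulas. The following is a minimal sketch of the two scoring schemes in their commonly used textbook forms, not the paper's own implementation. The function names, the whitespace-tokenized toy corpus, the smoothed IDF variants, and the defaults `k1=1.5` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def doc_freq(term, docs):
    """Number of documents in the corpus that contain the term."""
    return sum(1 for d in docs if term in d)


def tfidf_score(query, doc, docs):
    """Sum of TF * IDF over query terms (smoothed IDF, an assumed variant)."""
    counts = Counter(doc)
    score = 0.0
    for term in query:
        if counts[term] == 0:
            continue  # a term absent from the document contributes nothing
        tf = counts[term] / len(doc)  # relative term frequency
        idf = math.log((1 + len(docs)) / (1 + doc_freq(term, docs))) + 1.0
        score += tf * idf
    return score


def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    """BM25 with term-frequency saturation (k1) and length normalization (b)."""
    counts = Counter(doc)
    avgdl = sum(len(d) for d in docs) / len(docs)  # average document length
    score = 0.0
    for term in query:
        f = counts[term]
        if f == 0:
            continue
        n = doc_freq(term, docs)
        idf = math.log((len(docs) - n + 0.5) / (n + 0.5) + 1.0)
        norm = 1.0 - b + b * len(doc) / avgdl
        score += idf * f * (k1 + 1.0) / (f + k1 * norm)
    return score


# Hypothetical toy corpus, already tokenized on whitespace.
corpus = [
    "tf idf weights terms by frequency and rarity".split(),
    "bm25 ranking saturates term frequency and normalizes document length".split(),
    "collaborative filtering needs a large user base".split(),
]
query = ["bm25", "ranking"]

for i, doc in enumerate(corpus):
    print(i, tfidf_score(query, doc, corpus), bm25_score(query, doc, corpus))
```

One visible difference between the two schemes: in this TF-IDF variant the score grows linearly with the term count, whereas the BM25 term approaches a ceiling of `idf * (k1 + 1)` as the count increases, and `b` controls how strongly longer-than-average documents are penalized.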