Author: YOU Yi-ming (尤苡名) (School of Zhejiang Sci-Tech University, Information Academy, Hangzhou 310018, China)
Source: Software Guide (《软件导刊》), 2020, No. 4, pp. 229-233 (5 pages)
Abstract: Keyword extraction can mine the focus of user attention from massive volumes of product review text, which makes it easier to recommend suitable products afterwards. The classic keyword extraction algorithm TextRank ignores differences in the influence of neighboring word nodes when it iteratively computes the importance score of each word node. To address this, the paper proposes TFTR, an algorithm that combines TFIDF and TextRank to extract keywords from reviews. First, TFIDF is improved by incorporating the helpfulness feedback that users give after reading a review, which raises the weight of important words appearing in helpful reviews. The improved term frequency-inverse document frequency is then introduced into TextRank as the feature weight of each word node, improving how importance scores are distributed among word nodes. Experimental results show that, compared with the traditional TextRank algorithm, TFTR improves the accuracy of keywords extracted from product reviews by 15.7% under the P@10 metric, which demonstrates the effectiveness of the algorithm.
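Keywords: keyword extraction; TFIDF; TextRank; TFTR; review helpfulness feedback
Classification number: TP393 [Automation and Computer Technology - Computer Application Technology]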
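The abstract describes the two TFTR steps only in words, so the following is a minimal sketch of one way they could fit together, not the paper's actual formulas. The function names (`weighted_tfidf`, `build_graph`, `tftr_scores`), the `1 + log(1 + votes)` helpfulness boost, the smoothed IDF, and the rule that a node passes its score to neighbors in proportion to their TFIDF weight are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def weighted_tfidf(reviews, helpful_votes):
    """Helpfulness-boosted TF-IDF per word (assumed form, not the paper's).

    reviews: list of token lists; helpful_votes: list of non-negative ints.
    """
    n = len(reviews)
    tf, df = Counter(), Counter()
    for tokens, votes in zip(reviews, helpful_votes):
        boost = 1.0 + math.log1p(votes)  # assumption: log-damped vote boost
        for word, count in Counter(tokens).items():
            tf[word] += count * boost
            df[word] += 1
    total = sum(tf.values())
    # Smoothed IDF keeps every weight strictly positive.
    return {w: (tf[w] / total) * (math.log((1 + n) / (1 + df[w])) + 1.0)
            for w in tf}


def build_graph(reviews, window=2):
    """Undirected co-occurrence graph: words within `window` tokens are linked."""
    neighbors = defaultdict(set)
    for tokens in reviews:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1:i + 1 + window]:
                if other != word:
                    neighbors[word].add(other)
                    neighbors[other].add(word)
    return dict(neighbors)


def tftr_scores(reviews, helpful_votes, d=0.85, window=2, iters=50):
    """TextRank iteration where each node splits its score among neighbors
    in proportion to their TF-IDF weight instead of uniformly (assumption)."""
    weight = weighted_tfidf(reviews, helpful_votes)
    neighbors = build_graph(reviews, window)
    # Total TF-IDF weight of each node's neighbors (the normalizer).
    out_weight = {u: sum(weight[v] for v in neighbors[u]) for u in neighbors}
    score = {v: 1.0 for v in neighbors}
    for _ in range(iters):
        score = {
            v: (1 - d) + d * sum(weight[v] / out_weight[u] * score[u]
                                 for u in neighbors[v])
            for v in neighbors
        }
    return score


if __name__ == "__main__":
    # Toy, made-up reviews and vote counts, for illustration only.
    toy_reviews = [["battery", "life", "great"],
                   ["battery", "drains", "fast"],
                   ["screen", "great"]]
    toy_votes = [10, 0, 1]
    ranked = sorted(tftr_scores(toy_reviews, toy_votes).items(),
                    key=lambda kv: kv[1], reverse=True)
    for word, s in ranked:
        print(f"{word}\t{s:.3f}")
```

With uniform weights the update reduces to ordinary unweighted TextRank, which is why the TFIDF weight is placed in the transition step rather than applied as a final rescaling.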