Authors: LIU Ce (刘测); HAN Jiaxin (韩家新) [1]. Affiliation: [1] School of Computer, Xi'an Shiyou University, Xi'an 710065, China
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2018, No. 5, pp. 38-41 (4 pages)
Abstract: Text classification is the process of assigning documents to predefined categories based on their content. It is a necessary component of text retrieval systems: a text retrieval system retrieves text in response to a user's query, while a text understanding system transforms text in some way, such as generating a summary, answering a question, or extracting data [1]. This paper applies four methods, Naïve Bayes, support vector machine (SVM), K-nearest neighbor (KNN), and fastText, to news text classification and compares their advantages and disadvantages in terms of classification performance, complexity, and other aspects. Finally, it reviews accuracy and time, two performance metrics commonly used to evaluate classifiers [2].
Keywords: text classification; news text; Naïve Bayes; support vector machine; K-nearest neighbor; fastText
Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]
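
The kind of comparison the abstract describes, training several classifiers on the same news text and recording accuracy and time for each, can be illustrated with a minimal sketch. This is not the authors' implementation: the six-sentence corpus is invented for illustration, the names `NaiveBayes`, `KNN`, and `evaluate` are hypothetical, and only two of the four methods are shown, because SVM and fastText would normally come from external libraries rather than the standard library.

```python
import math
import time
from collections import Counter, defaultdict

# Toy corpus invented for illustration only (not the paper's data set).
TRAIN = [
    ("stocks fall as market fears grow", "finance"),
    ("bank raises interest rates again", "finance"),
    ("investors buy shares in tech market", "finance"),
    ("team wins final match in overtime", "sports"),
    ("striker scores twice in cup match", "sports"),
    ("coach praises team after big win", "sports"),
]
TEST = [
    ("market shares fall after bank report", "finance"),
    ("team scores late to win the match", "sports"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real news text needs more)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.class_counts = Counter(label for _, label in samples)
        self.word_counts = defaultdict(Counter)
        for text, label in samples:
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total = len(samples)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / self.total)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class KNN:
    """K-nearest neighbor over bag-of-words vectors with cosine similarity."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, samples):
        self.samples = [(Counter(tokenize(t)), label) for t, label in samples]
        return self

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
            sum(v * v for v in b.values())
        )
        return dot / norm if norm else 0.0

    def predict(self, text):
        query = Counter(tokenize(text))
        ranked = sorted(
            self.samples, key=lambda s: self._cosine(query, s[0]), reverse=True
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def evaluate(model, train, test):
    """Return (accuracy, seconds spent on training plus prediction)."""
    start = time.perf_counter()
    model.fit(train)
    predictions = [model.predict(text) for text, _ in test]
    elapsed = time.perf_counter() - start
    correct = sum(p == label for p, (_, label) in zip(predictions, test))
    return correct / len(test), elapsed


if __name__ == "__main__":
    for name, model in [("Naive Bayes", NaiveBayes()), ("KNN (k=3)", KNN(k=3))]:
        accuracy, seconds = evaluate(model, TRAIN, TEST)
        print(f"{name}: accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

On a corpus this small the timings are not meaningful; the point is only the shape of the experiment, in which every classifier passes through the same `evaluate` function so that accuracy and time are measured under identical conditions.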