A comparative study of classification methods for news texts
Cited by: 10


Authors: LIU Ce; HAN Jiaxin [1] (School of Computer, Xi'an Shiyou University, Xi'an 710065, China)

Affiliation: [1] School of Computer, Xi'an Shiyou University, Xi'an 710065, China

Source: Intelligent Computer and Applications, 2018, No. 5, pp. 38-41 (4 pages)

Abstract: Text classification is the process of assigning documents to predefined categories based on their content. It is a necessary requirement for text retrieval systems. A text retrieval system retrieves texts in response to a user's query, whereas a text understanding system transforms text in some way, such as by generating summaries, answering questions, or extracting data [1]. This paper applies four methods, Naive Bayes, support vector machines (SVM), K-nearest neighbors (KNN), and fastText, to news text classification and compares their advantages and disadvantages in terms of classification performance, complexity, and other aspects. Finally, it reviews two performance metrics commonly used to evaluate classifiers: accuracy and time [2].

Keywords: text classification; news text; Naive Bayes; support vector machine; K-nearest neighbors; fastText

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
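
The comparison described in the abstract, several classifiers scored on accuracy and running time, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the `evaluate` helper, and the class names are all hypothetical, only two of the four methods (Naive Bayes and KNN) are shown, and SVM and fastText are omitted because they would normally rely on external libraries.

```python
import math
import time
from collections import Counter, defaultdict

# Hypothetical toy corpus (not from the paper): (text, label) pairs.
TRAIN = [
    ("stocks rally as markets close higher", "finance"),
    ("central bank raises interest rates", "finance"),
    ("investors watch bond markets and rates", "finance"),
    ("team wins championship final match", "sports"),
    ("striker scores twice in league match", "sports"),
    ("coach praises team after final win", "sports"),
]
TEST = [
    ("bank rates lift markets", "finance"),
    ("team scores in final match", "sports"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs):
        self.class_counts = Counter(label for _, label in docs)
        self.word_counts = defaultdict(Counter)
        for text, label in docs:
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        self.total = len(docs)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / self.total)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class KNN:
    """K-nearest neighbors with majority vote over cosine similarity."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs):
        self.docs = [(Counter(tokenize(text)), label) for text, label in docs]
        return self

    def predict(self, text):
        query = Counter(tokenize(text))
        nearest = sorted(
            self.docs, key=lambda doc: cosine(query, doc[0]), reverse=True
        )[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(model, train, test):
    """Return (accuracy, seconds) for fitting on train and predicting test."""
    start = time.perf_counter()
    model.fit(train)
    predictions = [model.predict(text) for text, _ in test]
    elapsed = time.perf_counter() - start
    correct = sum(p == label for p, (_, label) in zip(predictions, test))
    return correct / len(test), elapsed


if __name__ == "__main__":
    for name, model in [("Naive Bayes", NaiveBayes()), ("KNN", KNN(k=3))]:
        accuracy, seconds = evaluate(model, TRAIN, TEST)
        print(f"{name}: accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

On a corpus this small the timings are dominated by noise, so the sketch only shows how the two metrics would be collected, not how the methods actually rank against each other.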