Authors: GONG Jing (龚静) [1]; HUANG Xin-yang (黄欣阳) [2]
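Affiliations: [1] Department of Public Basic Courses, Hunan Polytechnic of Environment and Biology, Hengyang, Hunan 421005, China; [2] College of Computer Science, University of South China, Hengyang, Hunan 421001, China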
Source: Computer Engineering and Design (《计算机工程与设计》), 2018, No. 5, pp. 1340-1344, 1349 (6 pages)
Funding: National Natural Science Foundation of China (61300234); Hunan Provincial Department of Education fund project (12C1056)
Abstract: To obtain more accurate and stable text classification results, the authors propose an improved text classification method based on k-nearest neighbor (k-NN) and term frequency-inverse document frequency (TF-IDF). The framework consists of five modules: a text module, a graphical user interface (GUI) module, a pre-processing module, a k-NN & TF-IDF module, and a similarity measurement module. For weight acquisition, feature words are assigned different coefficients according to their position in the document, and a weight matrix is constructed to reflect the importance and distribution of the feature words. For the implementation, query efficiency is optimized by executing revised Language-Integrated Query (LINQ) queries. Experimental results show that, compared with other classification methods, the proposed method has certain advantages in classification accuracy, recall, and F1 measure. The influence of the classifier on the overall text classification framework is also discussed; the experimental results show that the k-NN classifier is better suited to text classification than the support vector machine (SVM) classifier.
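The weighting and classification steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (the abstract indicates theirs uses LINQ), and the record gives neither the position coefficients nor the similarity measure. The `POSITION_COEFF` values, the `title`/`body` split, the smoothed IDF formula, the choice of cosine similarity, and the toy corpus are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical position coefficients; the abstract does not state the values.
POSITION_COEFF = {"title": 3.0, "body": 1.0}

def weighted_tf(doc):
    """doc maps a position name to a token list; returns position-weighted term counts."""
    tf = defaultdict(float)
    for position, tokens in doc.items():
        for tok in tokens:
            tf[tok] += POSITION_COEFF.get(position, 1.0)
    return tf

def build_idf(docs):
    """Smoothed inverse document frequency (an assumed, commonly used variant)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update({tok for tokens in doc.values() for tok in tokens})
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def vectorize(doc, idf):
    """Sparse TF-IDF vector; terms unseen in the training set are dropped."""
    return {t: w * idf[t] for t, w in weighted_tf(doc).items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_classify(query, train_docs, labels, k=3):
    """Majority vote among the k training documents most similar to the query."""
    idf = build_idf(train_docs)
    q = vectorize(query, idf)
    sims = sorted(
        ((cosine(q, vectorize(d, idf)), lab) for d, lab in zip(train_docs, labels)),
        reverse=True,
    )
    votes = Counter(lab for _, lab in sims[:k])
    return votes.most_common(1)[0][0]

# Toy corpus, invented for this example only.
train_docs = [
    {"title": ["football", "match"], "body": ["team", "goal", "score"]},
    {"title": ["basketball", "game"], "body": ["team", "score", "player"]},
    {"title": ["stock", "market"], "body": ["price", "trade", "bank"]},
    {"title": ["bank", "loan"], "body": ["interest", "price", "market"]},
]
labels = ["sports", "sports", "finance", "finance"]

query = {"title": ["team"], "body": ["goal", "score"]}
print(knn_classify(query, train_docs, labels, k=3))
```

In this sketch a word in the title counts three times as much as the same word in the body, which is one simple way to give feature words position-dependent coefficients; the paper's actual coefficients and weight matrix construction may differ.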
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]