Authors: GONG Jing (龚静) [1], HUANG Xin-yang (黄欣阳) [2]
Affiliations: [1] Department of Information Technology, Hunan Polytechnic of Environment and Biology, Hengyang 421001, Hunan, China; [2] College of Computer Science and Technology, University of South China, Hengyang 421001, Hunan, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2017, No. 8, pp. 2262-2268 (7 pages)
Funding: Hunan Provincial Department of Education fund project (12C1056)
Abstract: To select more features in each document category and to address the shortage of features produced by the at least one feature (ALOF) method, the maximum feature-f text (MFT) method and the improved maximum feature-f text (IMFT) method are proposed. The number of selected features is determined by the way the data are processed. MFT analyzes all documents, ensuring that every document in the training set is represented in the final feature vector. IMFT analyzes only the documents that contain features with high feature evaluation function (FEF) values, so it selects fewer features and reduces the probability of selecting irrelevant ones. The experiments use three document classification databases and three evaluation functions. Compared with the ALOF method and the fuzzy correlation clustering (FRC) method, the two proposed methods achieve higher F1 measures and better classification performance. The results also show that the evaluation function has an important influence on the final classification results, and that the number of features affects them as well.
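Keywords: document classification; evaluation function; maximum feature value; F1 measure; number of features
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]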
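The abstract gives only a high-level description of the two selection schemes, so the following is a rough sketch under stated assumptions and not the authors' algorithm. It assumes that each document contributes its f highest-scoring terms to the final feature set (which guarantees every training document is represented), and that the reduced variant skips documents whose best term score falls below a cut-off. The function names, the `scores` mapping (standing in for any feature evaluation function) and the `threshold` parameter are all hypothetical.

```python
def select_features_per_document(docs, scores, f):
    """Union of each document's f best-scoring terms, in selection order.

    docs   -- list of documents, each a list of terms
    scores -- dict mapping term -> feature evaluation score (assumed given)
    f      -- number of terms taken from each document (assumed parameter)
    """
    selected = []
    seen = set()
    for doc in docs:
        # Rank the document's distinct terms by descending score;
        # ties are broken alphabetically to keep the result deterministic.
        ranked = sorted(set(doc), key=lambda t: (-scores.get(t, 0.0), t))
        for term in ranked[:f]:
            if term not in seen:
                seen.add(term)
                selected.append(term)
    return selected


def select_features_reduced(docs, scores, f, threshold):
    """Same selection, restricted to documents with a high-scoring term.

    The cut-off rule (best term score >= threshold) is an assumption;
    the abstract only says that documents with high-valued features
    are analyzed.
    """
    strong_docs = [
        doc for doc in docs
        if max((scores.get(t, 0.0) for t in doc), default=0.0) >= threshold
    ]
    return select_features_per_document(strong_docs, scores, f)


# Toy data, invented purely for illustration.
docs = [["a", "b", "c"], ["c", "d"], ["e"]]
scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.4, "e": 0.05}

full_set = select_features_per_document(docs, scores, f=1)
reduced_set = select_features_reduced(docs, scores, f=1, threshold=0.5)
```

In this toy run the full scheme keeps one term for every document, including the weakly scored term of the third document, while the reduced scheme drops that document and therefore selects fewer features.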