Authors: WEN Wu (文武) [1,2,3]; WAN Yu-hui (万玉辉); ZHANG Xu-hong (张许红); WEN Zhi-yun (文志云)
Affiliations: [1] School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; [2] Research Center of New Telecommunication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; [3] Chongqing Information Technology Designing Co. Ltd., Chongqing 401121, China
Source: Computer Engineering & Science (《计算机工程与科学》), 2021, No. 9, pp. 1645-1652 (8 pages)
Abstract: Text data contains a large amount of noise and many redundant features. To obtain a more representative feature set, a feature selection algorithm (ICHIPCA) is proposed that combines improved chi-square statistics (ICHI) with principal component analysis (PCA). First, because the CHI algorithm ignores term frequency, document length, category distribution, and negative correlation, corresponding adjustment factors are introduced to improve the CHI calculation model. Next, the improved CHI model is used to score the features, and the top-ranked features are selected as the preliminary feature set. Finally, PCA extracts the principal components while largely preserving the original information, achieving dimensionality reduction. Experiments with a KNN classifier show that, compared with traditional feature selection algorithms of the same type such as IG and CHI, ICHIPCA improves classification performance across multiple feature dimensions and multiple numbers of categories.
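The scoring-and-ranking step described in the abstract can be illustrated with the classic (unimproved) CHI statistic. The sketch below is not the paper's ICHI model: the abstract does not give the adjustment factors, so none are applied, and the PCA and KNN stages are omitted. The function names (`chi_square`, `chi_scores`, `select_top_k`), the max-over-classes aggregation, and the toy corpus are all assumptions made for illustration. For a term t and class c, with A, B, C, D the usual document counts of the 2x2 contingency table and N = A + B + C + D, the score is N(AD - BC)^2 / ((A+C)(B+D)(A+B)(C+D)).

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Classic CHI statistic for one (term, class) contingency table.

    a: docs in the class containing the term
    b: docs outside the class containing the term
    c: docs in the class without the term
    d: docs outside the class without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def chi_scores(docs, labels):
    """Score every term by its maximum CHI value over all classes (an assumed aggregation)."""
    n_docs = len(docs)
    class_sizes = Counter(labels)
    # Only presence/absence is used; term frequency is ignored,
    # which is one of the CHI weaknesses the abstract mentions.
    doc_sets = [set(doc) for doc in docs]
    vocab = set().union(*doc_sets)
    scores = {}
    for term in vocab:
        in_class = Counter(lab for s, lab in zip(doc_sets, labels) if term in s)
        df = sum(in_class.values())  # document frequency of the term
        best = 0.0
        for cls, size in class_sizes.items():
            a = in_class[cls]
            b = df - a
            c = size - a
            d = n_docs - size - b
            best = max(best, chi_square(a, b, c, d))
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: tokenized documents with class labels.
docs = [
    ["goal", "match", "team"],
    ["goal", "league", "team"],
    ["stock", "market", "team"],
    ["stock", "bank", "market"],
]
labels = ["sports", "sports", "finance", "finance"]

scores = chi_scores(docs, labels)
top_terms = select_top_k(scores, 3)
print(top_terms)
```

In a full pipeline along the lines the abstract describes, the selected terms would form the preliminary feature set, whose document-term matrix would then be reduced with PCA before classification.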
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]