Authors: 高文昀, 戴胜, 涂丽萍, 张叶 (GAO Wenyun; DAI Sheng; TU Liping; ZHANG Ye)
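Affiliations: [1] Nanjing Les Information Technology Co., Ltd., Nanjing 210014, Jiangsu, China [2] North Information Control Research Academy Group Co., Ltd., Nanjing 211100, Jiangsu, China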
Source: 《长江信息通信》 (Changjiang Information & Communications), 2022, No. 1, pp. 46-50 (5 pages)
Funding: National Key Research and Development Program of China (2020YFC1511800).
Abstract: When the training set is very large or its classes are unbalanced, existing support vector machines (SVMs) take too long to train and yield a classification surface that deviates from the optimal one, so samples are misclassified. To address this, the paper proposes a weighted SVM method for unbalanced samples based on redundant data elimination. The method uses the Fisher discriminant ratio (FDR) criterion to remove redundant data, that is, training samples that do not help in training the final classification surface, and introduces a sample weighting strategy that assigns each training sample a weight according to its contribution to the fuzzy classification surface. Experimental results show that, compared with the traditional SVM, the method greatly shortens SVM training time on large unbalanced data sets and reduces the misclassification of test samples caused by class imbalance, which improves the SVM's classification performance.
Keywords: support vector machine; optimal classification surface; redundant data; sample weighting; sample imbalance
Classification code: TP309 [Automation and Computer Technology: Computer System Architecture]
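
The abstract names two ingredients, the Fisher discriminant ratio and per-sample weights, without giving formulas. As a rough illustration only, the sketch below uses the standard two-class, single-feature FDR, (mu1 - mu2)^2 / (var1 + var2). The pruning rule (`prune_redundant`, which keeps the samples of each class nearest the other class's mean) and the inverse-class-frequency weights (`class_balanced_weights`) are hypothetical stand-ins, not the paper's procedure, and the function names and `keep_fraction` parameter are assumptions made for this example.

```python
from statistics import mean, pvariance


def fisher_discriminant_ratio(pos, neg):
    """Standard two-class FDR of one feature: (mu1 - mu2)^2 / (var1 + var2)."""
    denom = pvariance(pos) + pvariance(neg)
    if denom == 0:
        # Both classes are constant: perfectly separable unless the means coincide.
        return float("inf") if mean(pos) != mean(neg) else 0.0
    return (mean(pos) - mean(neg)) ** 2 / denom


def prune_redundant(values, labels, keep_fraction=0.5):
    """Hypothetical pruning rule (not taken from the paper).

    For each of two classes, keep the fraction of samples whose feature value
    lies closest to the other class's mean, on the assumption that samples far
    from the boundary contribute little to the classification surface.
    Returns the sorted indices of the kept samples.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    means = {c: mean(v for v, y in zip(values, labels) if y == c) for c in classes}
    kept = []
    for c in classes:
        other = classes[1] if c == classes[0] else classes[0]
        idx = [i for i, y in enumerate(labels) if y == c]
        idx.sort(key=lambda i: abs(values[i] - means[other]))
        n_keep = max(1, round(keep_fraction * len(idx)))
        kept.extend(idx[:n_keep])
    return sorted(kept)


def class_balanced_weights(labels):
    """Assumed weighting: n / (k * n_class), so rarer classes weigh more."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]
```

The kept indices and weights could then be passed to any SVM trainer that accepts per-sample weights; the paper's own weighting depends on each sample's contribution to the fuzzy classification surface, which this sketch does not model.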