Source: Science Technology and Engineering (《科学技术与工程》), 2007, No. 21, pp. 5563-5566 (4 pages)
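Funding: Supported by the Natural Science Foundation of Guangdong Province (04020079); the Open Project of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University (93K-17-2006-03); and the Natural Science Foundation of South China University of Technology (B13-E5050190).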
Abstract: Learning from large samples is an important topic in support vector machine (SVM) research. This paper proposes a new SVM classification algorithm based on data partitioning and a neighborhood pair strategy. The algorithm first applies c-means clustering separately to the positive and negative classes of the training set, splitting the large dataset into mutually disjoint subsets. Each positive subset is then paired with each negative subset, giving m×n binary classification problems (for m positive and n negative subsets), each of which is solved with the SMO algorithm. Finally, the neighborhood pair strategy classifies an unseen sample with the binary classifier built from the two subsets, one from each class, that lie nearest to it. To validate the algorithm, it is applied to five benchmark UCI datasets and compared with the SMO algorithm. The results show that the new algorithm greatly reduces training time on large samples, with almost no loss in test accuracy.
Classification code: TP311.12 [Automation and Computer Technology: Computer Software and Theory]
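The three steps named in the abstract (per-class clustering, pairwise subproblems, neighborhood pair prediction) can be illustrated with a small sketch. This is not the authors' implementation: the function names (`kmeans`, `train_margin_perceptron`, `fit`, `predict`), the farthest-point initialization, and the toy data are all assumptions made for illustration, and a simple margin perceptron stands in for the SMO-trained SVM so that the example needs only the Python standard library.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, c, iters=20):
    """Plain Lloyd's c-means with farthest-point initialization (an assumption).
    Returns (centers, clusters); empty clusters are dropped."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(dist2(p, q) for q in centers)))
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda k: dist2(p, centers[k]))
            clusters[j].append(p)
        clusters = [cl for cl in clusters if cl]
        centers = [[sum(col) / len(cl) for col in zip(*cl)] for cl in clusters]
    return centers, clusters

def train_margin_perceptron(pos, neg, max_epochs=1000):
    """Stand-in for SMO: update whenever a point's functional margin is below 1.
    Stops early once every training point has margin >= 1."""
    data = [(x, 1) for x in pos] + [(x, -1) for x in neg]
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(max_epochs):
        updated = False
        for x, y in data:
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) < 1:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                updated = True
        if not updated:
            break
    return w, b

def fit(pos, neg, m, n):
    """Cluster each class separately, then train one classifier per subset pair."""
    pos_centers, pos_subsets = kmeans(pos, m)
    neg_centers, neg_subsets = kmeans(neg, n)
    classifiers = {
        (i, j): train_margin_perceptron(p_sub, n_sub)
        for i, p_sub in enumerate(pos_subsets)
        for j, n_sub in enumerate(neg_subsets)
    }
    return pos_centers, neg_centers, classifiers

def predict(model, x):
    """Neighborhood pair strategy: use the classifier trained on the nearest
    positive subset and the nearest negative subset (nearest by center)."""
    pos_centers, neg_centers, classifiers = model
    i = min(range(len(pos_centers)), key=lambda k: dist2(x, pos_centers[k]))
    j = min(range(len(neg_centers)), key=lambda k: dist2(x, neg_centers[k]))
    w, b = classifiers[(i, j)]
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy XOR-like data (hypothetical): no single line separates the classes,
# but every positive-subset / negative-subset pair is linearly separable.
pos = [[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]]
neg = [[0, 4], [0, 5], [1, 4], [4, 0], [5, 0], [4, 1]]
model = fit(pos, neg, m=2, n=2)
print(len(model[2]))               # number of pairwise classifiers: 4
print(predict(model, [0.5, 0.5]))  # 1
print(predict(model, [4.5, 0.5]))  # -1
```

Each subproblem sees only one positive subset and one negative subset, so it is much smaller than the full training set; that is the source of the training-time saving the abstract reports. Measuring "nearest" by distance to the subset center is one plausible reading of the neighborhood pair strategy, and the abstract does not specify the exact distance used.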