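Affiliations: [1] School of Computer Science and Technology, Tianjin University, Tianjin 300072 [2] School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318 [3] Internal Systems Development Department, Beijing Dangdang Information Technology Co., Ltd., Beijing 100028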
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2013, No. 11, pp. 2253-2261 (9 pages)
Funding: National Natural Science Foundation of China (61170019); Natural Science Foundation of Tianjin (11JCYBJC00700)
Abstract: The regularization path algorithm is an efficient method for numerically solving the support vector machine (SVM) classification problem: it fits the entire path of SVM solutions for every value of the regularization parameter at essentially the same computational cost as fitting a single SVM model. Existing SVM regularization path algorithms either cannot handle datasets containing duplicate data points, nearly duplicate points, or linearly dependent points, or incur high computational cost. To address these issues, this paper applies a solution method for positive definite systems of equations to the SVM regularization path and proposes the positive definite SVM path algorithm (PDSVMP). PDSVMP transforms the coefficient matrix of the system of iteration equations into a positive definite matrix and uses Cholesky decomposition to compute the Lagrange multiplier increment vector at each inflection point on the path. Unlike existing algorithms, which solve for the regularization parameter directly, PDSVMP determines the parameter increment from the changes in the active set and then computes the regularization parameter from that increment. This treatment guarantees theoretical correctness and numerical stability and can reduce computational complexity. Experiments on an example dataset and on benchmark datasets show that PDSVMP correctly handles datasets containing duplicate data points, nearly duplicate points, or linearly dependent points, and is computationally efficient.
Keywords: support vector machine; regularization path; active set; positive definite matrix; Cholesky decomposition
CLC number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
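Illustration: The abstract says that PDSVMP solves a positive definite system by Cholesky decomposition at each inflection point, but this record does not say how the coefficient matrix is made positive definite. The sketch below is therefore not the paper's algorithm. It only shows the generic step of solving a symmetric positive definite system with a Cholesky factor, on toy data with a duplicated point. The diagonal shift `eps`, the helper name `solve_spd`, the linear kernel, and the right-hand side `b` are all assumptions made for the example.

```python
import numpy as np


def solve_spd(A, b):
    """Solve A x = b for a symmetric positive definite A via Cholesky (hypothetical helper)."""
    L = np.linalg.cholesky(A)        # A = L @ L.T; raises LinAlgError if A is not positive definite
    y = np.linalg.solve(L, b)        # forward step: L y = b
    return np.linalg.solve(L.T, y)   # backward step: L.T x = y


# Toy data: the first two points are duplicates, so the Gram matrix is rank deficient.
X = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [3.0, -1.0]])
K = X @ X.T                          # linear-kernel Gram matrix (assumed kernel)

# Assumption: a small diagonal shift stands in for whatever transformation
# the paper uses to obtain a positive definite coefficient matrix.
eps = 1e-6
A = K + eps * np.eye(K.shape[0])

b = np.ones(K.shape[0])              # placeholder right-hand side
x = solve_spd(A, b)
print("residual norm:", np.linalg.norm(A @ x - b))
```

The point of the toy data is that duplicated rows make `K` singular, which is the situation the abstract says existing path algorithms struggle with; once the matrix is positive definite, the Cholesky factorization exists and the solve reduces to two triangular systems.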