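Affiliations: [1] School of Computer Science and Engineering, University of Electronic Science and Technology of China [2] Automation Station, Unit 95661 [3] Chongqing Jinmei Communication Co., Ltd.
Source: Computer Applications and Software (《计算机应用与软件》), 2013, No. 1, pp. 213-215 (3 pages)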
Abstract: This paper proposes UCM (UC and SVM), a webpage classification method for large-scale training sets. UCM combines the strengths of the support vector machine (SVM) and unsupervised clustering (UC), so that webpage classification is both accurate and fast. In the training stage, UCM uses UC to form clustering centres. In the classification stage, UCM computes the distance from the webpage to be classified to the positive-example centres and to the negative-example centres; if the difference between the two distances is large enough, the page is classified by UC, otherwise by the pruned SVM. Application in an e-government webpage classification system shows that UCM's accuracy is much higher than UC's and slightly higher than SVM's, while its classification speed lies between the two: lower than UC's and far higher than SVM's.
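The classification-stage rule described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the distance measure, the threshold, or how the centres are built, so Euclidean distance, a single mean centre per class, the `threshold` parameter, and the helper names (`centroid`, `ucm_classify`, `toy_svm_predict`) are all hypothetical. The SVM is represented by a plain callable so the sketch stays self-contained.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Mean of a non-empty list of equal-length vectors (assumed centre definition)."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance; the paper's actual distance measure is not stated."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ucm_classify(
    x: Vector,
    pos_centres: List[Vector],
    neg_centres: List[Vector],
    threshold: float,
    svm_predict: Callable[[Vector], int],
) -> Tuple[int, str]:
    """Return (label, route): label is +1 or -1, route says which classifier decided."""
    d_pos = min(euclidean(x, c) for c in pos_centres)  # nearest positive centre
    d_neg = min(euclidean(x, c) for c in neg_centres)  # nearest negative centre
    if abs(d_pos - d_neg) > threshold:
        # Clear-cut case: the nearer centre decides (the fast UC path).
        return (1 if d_pos < d_neg else -1), "UC"
    # Ambiguous case: defer to the slower, more precise classifier.
    return svm_predict(x), "SVM"


def toy_svm_predict(x: Vector) -> int:
    """Hypothetical stand-in for a trained SVM: a fixed linear decision rule."""
    return 1 if x[0] + x[1] >= 0 else -1


# Toy data: one centre per class, built from two made-up points each.
pos_centres = [centroid([[1.0, 1.0], [1.2, 0.8]])]
neg_centres = [centroid([[-1.0, -1.0], [-0.8, -1.2]])]

far_result = ucm_classify([1.0, 1.0], pos_centres, neg_centres, 1.0, toy_svm_predict)
near_result = ucm_classify([0.1, -0.05], pos_centres, neg_centres, 1.0, toy_svm_predict)
print(far_result, near_result)
```

In this toy run the first point lies close to the positive centre and is settled on the UC path, while the second is almost equidistant from both centres and is passed to the stand-in SVM. A larger `threshold` sends more pages to the SVM, which trades speed for precision.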
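Keywords: support vector machine; clustering; large-scale training set; webpage classification system; e-government
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]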