UCM Algorithm and Its Application in an E-Government Webpage Classification System


Authors: Li Hengrui (李恒锐)[1], Wan Yangliang (万杨亮), Zhou Jihua (周继华)

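Affiliations: [1] School of Computer Science and Engineering, University of Electronic Science and Technology of China; [2] Automation Station, Unit 95661; [3] Chongqing Jinmei Communication Co., Ltd.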

Source: Computer Applications and Software (《计算机应用与软件》), 2013, No. 1, pp. 213-215 (3 pages)

Abstract: This paper presents UCM (UC and SVM), a webpage classification method for large-scale training sets. UCM combines the strengths of the support vector machine (SVM) and unsupervised clustering (UC), so that webpage classification is both accurate and fast. In the training stage, UCM uses UC to form cluster centres. In the classification stage, UCM computes the distances from the webpage to be classified to the positive centres and to the negative centres. If the difference between the two distances is large enough, the webpage is classified by UC; otherwise it is classified by the pruned SVM. Application in an e-government webpage classification system shows that UCM's precision is much higher than that of UC and slightly higher than that of SVM. In classification speed, UCM falls between the two: slower than UC and far faster than SVM.
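The classification-stage rule described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the distance measure, how multiple centres are combined, or the threshold value, so the Euclidean distance, the nearest-centre rule, the `threshold` parameter, and the `svm_predict` callable (a stand-in for the trained SVM) are all assumptions.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


def nearest_distance(x: Vector, centres: Sequence[Vector]) -> float:
    """Euclidean distance from x to the closest centre (assumed measure)."""
    return min(math.dist(x, c) for c in centres)


def ucm_classify(
    x: Vector,
    pos_centres: Sequence[Vector],
    neg_centres: Sequence[Vector],
    threshold: float,
    svm_predict: Callable[[Vector], int],
) -> Tuple[int, str]:
    """Return (label, route): label is +1 or -1, route is "UC" or "SVM".

    threshold and svm_predict are hypothetical names; the abstract gives
    neither a threshold value nor the SVM's configuration.
    """
    d_pos = nearest_distance(x, pos_centres)
    d_neg = nearest_distance(x, neg_centres)
    # Large gap between the two distances: the centres alone decide.
    if abs(d_pos - d_neg) > threshold:
        return (1 if d_pos < d_neg else -1), "UC"
    # Ambiguous case: defer to the slower but more precise SVM.
    return svm_predict(x), "SVM"


# Toy usage with one centre per class and a stub in place of a real SVM.
def stub_svm(x: Vector) -> int:
    return 1 if sum(x) >= 0 else -1


pos = [[1.0, 1.0]]
neg = [[-1.0, -1.0]]
clear_case = ucm_classify([0.9, 1.1], pos, neg, 0.5, stub_svm)
ambiguous_case = ucm_classify([0.1, -0.05], pos, neg, 0.5, stub_svm)
```

In this toy setup the first point lies close to the positive centre and is decided by the centres alone, while the second lies near the midpoint and is passed to the stub classifier. The speed trade-off reported in the abstract is consistent with this routing: only the ambiguous pages incur the cost of the SVM.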

Keywords: support vector machine; clustering; large-scale training set; webpage classification system; e-government

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
