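Affiliations: [1] Hunan University of Technology, Zhuzhou, Hunan 412007 [2] Zhengzhou Tourism College, Zhengzhou, Henan 450009
Source: Computer Simulation (《计算机仿真》), 2011, No. 10, pp. 121-124, 252 (5 pages in total)
Funding: Hunan Provincial Science and Technology Department Program (2010FJ3024); Hunan University of Technology Teaching Reform Research Project (09A02)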
Abstract: Automatic web page classification aims to help users quickly find the pages they need. The number of pages on the web is very large, and web data is semi-structured, massive, high-dimensional text. Traditional text classification methods cannot reduce dimensionality or eliminate redundant information, so they are prone to the curse of dimensionality and yield low classification accuracy, which makes it hard for users to find the pages they need. To improve classification accuracy, an automatic web page classification method based on a principal component support vector machine is proposed. First, the web page data is preprocessed and feature vectors are extracted, eliminating redundant information. Next, principal component analysis reduces the dimensionality of the feature vectors. Finally, a support vector machine classifies the pages. Simulation experiments on a web page dataset show that classification accuracy exceeds 95% and that classification is fast, indicating that the principal component support vector machine is an effective web page classification method.
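The three stages named in the abstract (feature extraction, dimensionality reduction with principal component analysis, classification with a support vector machine) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how features were extracted or which kernel was used, so TF-IDF weighting, a linear kernel, two principal components, the scikit-learn library, and the toy corpus below are all choices made here for the example.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Hypothetical toy corpus: page text assumed to be already stripped of markup.
pages = [
    "football match score league goal",
    "basketball team score playoff game",
    "tennis player match tournament win",
    "stock market shares price investor",
    "bank interest rate loan market",
    "investor fund price bond shares",
]
labels = ["sports", "sports", "sports", "finance", "finance", "finance"]


def train(pages, labels, n_components=2):
    """Fit the feature extractor, the PCA projection, and the SVM in sequence."""
    # Step 1: turn each page into a feature vector (TF-IDF is an assumption).
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(pages).toarray()
    # Step 2: project the feature vectors onto the leading principal components.
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(features)
    # Step 3: train the SVM on the reduced vectors (linear kernel is an assumption).
    svm = SVC(kernel="linear", C=1.0)
    svm.fit(reduced, labels)
    return vectorizer, pca, svm


def classify(model, new_pages):
    """Apply the fitted stages, in the same order, to unseen pages."""
    vectorizer, pca, svm = model
    reduced = pca.transform(vectorizer.transform(new_pages).toarray())
    return svm.predict(reduced).tolist()


model = train(pages, labels)
predictions = classify(model, ["league match goal", "loan interest price"])
```

The point of the sketch is that the SVM never sees the full vocabulary-sized vectors, only their low-dimensional projection, which is the step the abstract credits with avoiding the curse of dimensionality. On a real page collection the number of components would be chosen from the data, for example by retained variance, rather than fixed at two.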
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]