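Affiliations: [1] Department of Computer Science and Information Technology, Liupanshui Normal University, Liupanshui 553004, Guizhou, China; [2] School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China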
Source: Journal of Shenyang University of Technology, 2018, No. 1, pp. 82-87 (6 pages)
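Funding: Guizhou Provincial Science and Technology Fund Program (20157606); Guizhou Provincial Department of Education Young Scientific and Technological Talent Growth Program (2016267)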
Abstract: To address the low accuracy, low recall, and low efficiency of traditional web page classification, a web page pre-classification algorithm based on Naive Bayes classification is proposed. The algorithm extracts relevant URLs from a user's online activity, analyzes the content and keywords of the corresponding web pages, and classifies the pages with the Naive Bayes algorithm. The user's behavior characteristics are then analyzed from how the user browses each category of web page. An improved text weight calculation method is adopted and a URL pre-classification mechanism is introduced, which improves both data processing efficiency and classification accuracy. The results show that the URL classification algorithm is accurate, can fully uncover users' interests and preferences, and can serve as a data algorithm for user behavior analysis in commercial promotion and judicial forensics.
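Keywords: web page keywords; Naive Bayes; web page classification; behavior characteristics; weight calculation method; URL pre-classification; commercial promotion; judicial forensics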
Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
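The abstract describes a two-stage flow: a URL pre-classification step followed by Naive Bayes classification of page keywords. The sketch below illustrates that general idea only and is not the authors' implementation. The `KNOWN_SITES` table, the class and function names, and the example hosts are all hypothetical, and plain keyword counts with Laplace smoothing stand in for the paper's improved text weight calculation, which this record does not specify.

```python
import math
from collections import Counter, defaultdict

# Hypothetical host-to-category table standing in for URL pre-classification.
KNOWN_SITES = {
    "news.example.com": "news",
    "shop.example.com": "shopping",
}


class NaiveBayes:
    """Multinomial Naive Bayes over page keywords with Laplace smoothing."""

    def __init__(self):
        self.class_docs = Counter()               # pages seen per category
        self.word_counts = defaultdict(Counter)   # keyword counts per category
        self.vocab = set()                        # all keywords seen in training

    def train(self, keywords, label):
        self.class_docs[label] += 1
        self.word_counts[label].update(keywords)
        self.vocab.update(keywords)

    def predict(self, keywords):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            # Log prior of the category.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in keywords:
                # Add-one smoothing so unseen keywords do not zero the score.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify_page(host, keywords, model):
    """Use the pre-classified host if known, otherwise fall back to Naive Bayes."""
    if host in KNOWN_SITES:
        return KNOWN_SITES[host]
    return model.predict(keywords)


model = NaiveBayes()
model.train(["election", "government", "policy"], "news")
model.train(["football", "match", "goal"], "sports")
print(classify_page("shop.example.com", ["match"], model))
print(classify_page("unknown.example.org", ["match", "goal"], model))
```

In this sketch, a host found in the lookup table skips the keyword model entirely, which is one plausible way a pre-classification step could reduce processing cost; only unknown hosts are scored by Naive Bayes.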