Authors: Wu Xiaoying [1]; Zhu Jinsong
Affiliations: [1] Library of Chongqing University of Science & Technology, Chongqing 401331; [2] Changan Ford Motor Company, Chongqing 401331
Source: Information Research (《情报探索》), 2022, No. 3, pp. 49-55 (7 pages)
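Funding: An outcome of the Chongqing Municipal Education Commission Humanities and Social Sciences Research Project "Knowledge Dissemination and User Acceptance in Chongqing's Maker Spaces in the Context of Mass Entrepreneurship and Innovation" (No. 19SKGH198); the Chongqing Social Science Planning Commissioned Project "A Model for University Libraries Supporting the Cultivation of Talent Innovation Ability in the Context of the Double First-Class Initiative" (No. 2019WT38); and the Chongqing Library Society research project "Pathways for University Libraries to Support the Cultivation of Students' Innovation Ability Based on Experiential Marketing" (No. CQTX202012).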
Abstract: [Purpose/significance] Existing methods for analyzing Internet users' website visits depend on updates to a sample rule base and have difficulty identifying and analyzing visits to new websites. To address these problems, the paper uses the BI-LSTM and BI-LSTM+Attention algorithms to build a website identification model that identifies and predicts the intent and safety of Internet users' website visits. [Method/process] BI-LSTM is used to analyze and identify websites across multiple structures. Based on the structural characteristics of website links, domain name information and parameter information are extracted as the main analysis data, and a crawler collects information on some well-known domain names to build a corpus. Word2vec is used to obtain word-vector features of the domain name in a website link, which serve as the first type of website-structure identification; TF-IDF combined with the N-Gram algorithm is used to obtain feature vectors of the parameters in the link, which serve as the second type. Together these form the website identification model. [Result/conclusion] The identification method of the multi-structure website analysis model is suitable for Internet users of all ages and all levels of information literacy. The detection method that combines deep learning with website structure helps maintain a healthy online environment when applied during browsing.
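The feature-extraction step described in the abstract (splitting a link into its domain name and its parameters, then turning the parameters into TF-IDF vectors over N-grams) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `split_url`, `char_ngrams`, and `tfidf_vectors` are hypothetical, character bigrams and an unsmoothed `log(N / df)` weighting are assumptions, and the Word2vec and BI-LSTM stages are omitted.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def split_url(url):
    """Return (domain, parameter string) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    return parts.hostname or "", parts.query


def char_ngrams(text, n=2):
    """Sliding-window character n-grams; n=2 is an assumed choice."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n=2):
    """TF-IDF over character n-grams; returns (vocabulary, list of vectors).

    Assumption: term frequency is the n-gram count divided by the number of
    n-grams in the document, and IDF is the unsmoothed log(N / df).
    """
    grams = [Counter(char_ngrams(d, n)) for d in docs]
    vocab = sorted(set().union(*grams))
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(g for counts in grams for g in counts)
    vectors = []
    for counts in grams:
        total = sum(counts.values()) or 1  # guard against empty documents
        vectors.append(
            [(counts[g] / total) * math.log(len(docs) / df[g]) for g in vocab]
        )
    return vocab, vectors


if __name__ == "__main__":
    # Illustrative URLs only; they are not taken from the paper's data.
    urls = [
        "http://example.com/search?q=books&page=2",
        "http://example.org/item?id=42",
    ]
    domains, params = zip(*(split_url(u) for u in urls))
    vocab, vectors = tfidf_vectors(list(params))
    print(domains)
    print(len(vocab), "n-gram features per parameter string")
```

In a pipeline like the one the abstract outlines, the domain strings would go to a Word2vec model trained on the crawled domain corpus, and both feature sets would feed the BI-LSTM classifier; those parts depend on details the abstract does not give.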
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]