Research on Website Type Recognition Based on Deep Learning

Authors: YIN Jie [1]; NI Pengrui (Department of Information Technology, Fuyang Industrial & Economical School, Fuyang 236032, China)

Affiliation: [1] Department of Information Technology, Fuyang Industrial & Economical School, Fuyang 236032, Anhui, China

Source: Electronic Design Engineering, 2023, No. 21, pp. 42-46 (5 pages)

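Funding: Key Teaching Research Project of the Anhui Provincial Department of Education (2017jyxm0293); National Educational Information Technology Research Project (173430009).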

Abstract: To address the weak feature extraction ability of basic deep learning models, the inability of static word vector models to represent polysemous words, and the low accuracy of website type recognition, a website type recognition model based on ERNIE2.0-MCNN-BiSRU-AT is proposed. ERNIE2.0 learns dynamic vector representations by taking the specific context of each word into account, which resolves the polysemy problem of static word vectors. The multi-feature fusion network captures local semantic features at multiple scales together with contextual sequence features. The soft attention mechanism computes a weight score for each feature's contribution to the classification result in order to highlight the key classification features, and a linear classification layer outputs the website type recognition result. Experiments on a real-world website type dataset show that the ERNIE2.0-MCNN-BiSRU-AT model reaches an F1 score of 95.67%, higher than the recent well-performing deep learning models it was compared against. Extensive ablation experiments verify the effectiveness of each functional module.
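The soft attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact scoring function, so the form used here (a tanh of the dot product with a learned query vector, followed by a softmax) is an assumption, and the names `soft_attention`, `features`, and `query` are hypothetical. The code is written in plain Python for readability and is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(
    features: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weight each feature vector and return (weights, pooled vector).

    Assumed scoring function (not taken from the paper):
        score_t = tanh(features[t] . query)
    The weights are the softmax of the scores, and the pooled vector
    is the weighted sum of the feature vectors.
    """
    scores = [
        math.tanh(sum(h_i * q_i for h_i, q_i in zip(h, query)))
        for h in features
    ]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, features))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical toy input: three 2-dimensional feature vectors.
example_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
example_query = [1.0, 0.0]
example_weights, example_pooled = soft_attention(example_features, example_query)
```

In a full model of the kind the abstract describes, the pooled vector would be passed to the linear classification layer, and the query vector would be learned during training rather than fixed by hand.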

Keywords: website classification; ERNIE2.0; multi-feature fusion network; soft attention; BiSRU

Classification number: TN-9 [Electronics and Telecommunications]

 
