A Short Text Sensitive Word Recognition Method Based on Bi-LSTM Neural Network (基于Bi-LSTM神经网络的短文本敏感词识别方法)

Cited by: 2


Authors: ZHOU Junya (周军芽); WU Jinwei (吴进伟); WU Guangfei (吴广飞); ZHANG Hewei (张何为) (Lishui Power Supply Company, State Grid Zhejiang Electric Power Co., Ltd., Lishui 323000, China; other affiliation not given)

Affiliation: [1] Lishui Power Supply Company, State Grid Zhejiang Electric Power Co., Ltd., Lishui 323000, Zhejiang, China

Source: Journal of Wuhan University of Technology: Information & Management Engineering (《武汉理工大学学报(信息与管理工程版)》), 2024, No. 2, pp. 312-316 (5 pages)

Abstract: To identify and handle sensitive words accurately, and to address the problems of high word-segmentation latency and low recognition accuracy, this paper proposes a short-text sensitive word recognition method based on a bidirectional long short-term memory (Bi-LSTM) neural network. The sensitive lexicon is analyzed and divided into two categories and three levels, and interference in the short text (special characters, traditional Chinese characters, and split Chinese characters) is preprocessed. A Bi-LSTM network is then used to build a short-text word segmentation model, with the optimal parameters determined through secondary training. Sensitivity values of words are computed repeatedly, sensitive words are extracted with a sensitivity comparison function, and the extracted words are matched against the sensitive lexicon to determine their category and level, which completes the recognition. Experimental results show that, across the different experimental groups, the segmentation latency of the proposed method stays below the given maximum limit and the recognition accuracy for sensitive words in short text exceeds 84.42%, indicating good practical performance.
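The pipeline order described in the abstract (preprocess, segment, score, match against the lexicon) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the model, the sensitivity comparison function, or the lexicon contents, so everything named here is a hypothetical stand-in. The whitespace splitter replaces the Bi-LSTM segmenter, `lexicon_score` replaces the sensitivity computation, the category names are placeholders, and the three lookup tables are toy data.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class LexiconEntry:
    category: str  # placeholder; the abstract does not name the two categories
    level: int     # 1..3, mirroring the three levels mentioned in the abstract


# Toy tables, all hypothetical: a real system would load full resources.
TRADITIONAL_TO_SIMPLIFIED: Dict[str, str] = {"詞": "词", "識": "识"}
SPLIT_TO_WHOLE: Dict[str, str] = {"木木": "林"}
LEXICON: Dict[str, LexiconEntry] = {
    "badword": LexiconEntry("category_a", 3),
    "林": LexiconEntry("category_b", 1),
}


def preprocess(text: str) -> str:
    """Strip special characters, simplify traditional characters, rejoin split characters."""
    # Keep only letters, digits, and whitespace (\w covers CJK in Python 3).
    text = re.sub(r"[^\w\s]|_", "", text)
    text = "".join(TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)
    for parts, whole in SPLIT_TO_WHOLE.items():
        text = text.replace(parts, whole)
    return text


def whitespace_segment(text: str) -> List[str]:
    """Stand-in for the Bi-LSTM segmenter: splits on whitespace only."""
    return text.split()


def lexicon_score(token: str) -> float:
    """Stand-in sensitivity value: 1.0 for lexicon entries, 0.0 otherwise."""
    return 1.0 if token in LEXICON else 0.0


def recognize(
    text: str,
    segment: Callable[[str], List[str]] = whitespace_segment,
    score: Callable[[str], float] = lexicon_score,
    threshold: float = 0.5,
) -> List[Tuple[str, LexiconEntry]]:
    """Return (word, lexicon entry) pairs for tokens that pass the threshold."""
    results = []
    for token in segment(preprocess(text)):
        if score(token) >= threshold and token in LEXICON:
            results.append((token, LEXICON[token]))
    return results
```

Calling `recognize("this is b*a*d-word !!")` strips the inserted symbols before lookup, so the obfuscated token is matched to its lexicon entry with category and level. The `segment` and `score` parameters are left pluggable so that a trained segmentation model and a real sensitivity function could replace the stand-ins.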

Keywords: short text; sensitive word recognition; text filtering; edit distance; bidirectional long short-term memory neural network

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
