Research on a Cyberbullying Text Detection Model Based on the SHAP Explanation Tool

Authors: LIU Dong (刘冬); LIU Ruili (刘瑞丽); WENG Haiguang (翁海光) (Department of Informatization and Network Security, Shanghai Police College, Shanghai 200137, China)

Affiliation: [1] Department of Informatization and Network Security, Shanghai Police College, Shanghai 200137

Source: Journal of People's Public Security University of China (Science and Technology), 2024, No. 3, pp. 59-69 (11 pages)

Funding: Shanghai Police College research project (23xkx53)

Abstract: To quickly identify whether text on social media platforms constitutes cyberbullying, this paper proposes a cyberbullying text detection model based on RoBERTa-BiGRU. The model first uses pretrained RoBERTa to extract semantic features from the text and then applies a BiGRU to integrate and refine those features. The classification performance of the RoBERTa-BiGRU model is then evaluated on the cyberbullying text detection dataset CB-tweets. Finally, the SHAP explanation tool is introduced to compare and analyze, at both the global and local levels, the key features identified by the model and the baseline values. The experimental results show that the RoBERTa-BiGRU model achieves higher classification accuracy. The SHAP analysis finds that the keywords computed for the Age, Ethnicity, Gender, and Religion categories match the label topics of those categories, whereas the keywords found for the Other CB and Not CB categories are mostly rare characters and run-together words, indicating that the model has not truly learned the intrinsic differences between the Other CB and Not CB categories.
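To make the local-explanation step more concrete, the following is a minimal sketch of the idea underlying SHAP: each token receives its exact Shapley value, and the attributions are measured against a baseline value obtained when every token is masked. It is not the authors' code. The scoring function `toy_score`, its keyword weights, and the example tokens are hypothetical stand-ins for one class score of a trained classifier, and exact enumeration is used here only because the input is tiny; the SHAP library relies on approximations for real models.

```python
from itertools import combinations
from math import factorial

# Hypothetical keyword weights standing in for one class score of a trained
# classifier; these numbers are illustrative and do NOT come from the paper.
TOY_WEIGHTS = {"the": 0.1, "school": 1.0, "bullied": 2.0}
INTERACTION_BONUS = 0.5  # extra score when "school" and "bullied" co-occur


def toy_score(tokens):
    """Score the visible tokens; an empty input yields the baseline 0.0."""
    score = sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    if "school" in tokens and "bullied" in tokens:
        score += INTERACTION_BONUS
    return score


def shapley_values(tokens, score_fn):
    """Exact Shapley value per token position; absent tokens count as masked."""
    n = len(tokens)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [tokens[j] for j in subset]
                with_i = without_i + [tokens[i]]
                # Marginal contribution of token i to this coalition
                phi += weight * (score_fn(with_i) - score_fn(without_i))
        values.append(phi)
    return values


tokens = ["the", "school", "bullied"]
base_value = toy_score([])  # output with every token masked
phi = shapley_values(tokens, toy_score)

# Rank tokens by attribution, largest first
for tok, v in sorted(zip(tokens, phi), key=lambda p: -p[1]):
    print(f"{tok:>8}: {v:+.3f}")
```

In this sketch the baseline value plus the sum of the token attributions equals the score of the full input, which is the additivity property that lets per-token contributions be read against the baseline. A global view of the kind described in the abstract could then be approximated by aggregating such per-token values over many texts in a category.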

Keywords: cyberbullying; SHAP; RoBERTa; BiGRU; text detection

Classification No.: D918.92 [Politics and Law: Law]

 
