Authors: ZHOU Zifan; LI Zhi
Affiliations: [1] Graduate School, China People's Police University, Langfang, Hebei 065000, China; [2] School of Intelligent Policing, China People's Police University, Langfang, Hebei 065000, China
Source: Journal of China People's Police University, 2024, No. 10, pp. 31-38 (8 pages)
Abstract: Hate speech recognition on social media is a critical task in the field of open-source intelligence. To address the poor recognition performance of multilingual text models and the high computational resource requirements of pre-trained models, we propose a multi-teacher knowledge distillation scheme. First, several large language models are used to obtain probability distribution matrices. Then, comprehensive soft labels are generated based on integrated general relevance weights and language-specific advantage weights to guide the student model training. Experimental results show that the student model distilled in this way can significantly reduce computation time and save computational resources while inheriting the language-specific advantages of each teacher model.
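The abstract describes combining each teacher's output distribution under two weight vectors (general relevance and language-specific advantage) to form a single soft label. A minimal sketch of that combination step, assuming the paper's weights are multiplied element-wise and normalized (the function name and example values are hypothetical, not from the paper):

```python
def combine_soft_labels(teacher_probs, general_weights, language_weights):
    """Fuse per-teacher class distributions into one soft label.

    teacher_probs: list of per-teacher probability distributions over classes.
    general_weights: one overall relevance weight per teacher.
    language_weights: one weight per teacher for the sample's language.
    """
    # Combine the two weightings per teacher, then normalize to sum to 1.
    w = [g * l for g, l in zip(general_weights, language_weights)]
    total = sum(w)
    w = [x / total for x in w]
    # Weighted average of the teacher distributions, class by class.
    n_classes = len(teacher_probs[0])
    return [sum(w[t] * teacher_probs[t][c] for t in range(len(w)))
            for c in range(n_classes)]

# Hypothetical example: three teachers, binary hate/non-hate classification.
probs = [[0.9, 0.1], [0.6, 0.4], [0.8, 0.2]]
soft_label = combine_soft_labels(probs, [0.5, 0.3, 0.2], [1.0, 0.5, 1.5])
```

The resulting `soft_label` is itself a valid probability distribution and would serve as the distillation target for the student model (e.g. via a KL-divergence loss against the student's output).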
Classification code: TP391.1 (Automation and Computer Technology: Computer Application Technology)