Authors: Liu Dong; Weng Haiguang; Chen Yimin (Shanghai Police College, Shanghai 200137, China; Shanghai Jian Qiao University, Shanghai 201306, China)
Affiliations: [1] Shanghai Police College, Shanghai 200137; [2] Shanghai Jian Qiao University, Shanghai 201306
Source: Computer Applications and Software (《计算机应用与软件》), 2024, No. 9, pp. 217-223, 229 (8 pages)
Funding: Shanghai Police College research project (23xkx53).
Abstract: To address the problems of extremely short texts and severely imbalanced class distribution in 110 emergency-call (police incident) text data, a BERT-BiGRU-WCELoss incident classification model is proposed. The model extracts text semantics with a Chinese pre-trained BERT (Bidirectional Encoder Representations from Transformers) model, further distills the semantic features with a BiGRU (Bidirectional Gated Recurrent Unit), and, through an optimized adaptive weighted loss function, WCELoss (Weighted Cross Entropy Loss), assigns larger loss weights to minority-class samples. Experimental results show that the model achieves a classification accuracy of 95.83% on a dataset of 110 emergency calls from one calendar month of 2015 in a certain city, and that its precision, recall, F1 score, and G-mean are all higher than those of traditional deep learning models and of models trained with the standard cross-entropy loss.
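The core idea of weighting the cross-entropy loss by class rarity can be sketched in a few lines. Note this is only an illustration of the general weighted cross-entropy technique: the paper's WCELoss adapts its weights during training, and the inverse-frequency weighting and function names below are assumptions, not the authors' implementation.

```python
import math
from collections import Counter

def class_weights(labels):
    """Inverse-frequency class weights: rarer classes receive larger
    weights (a common fixed scheme; the paper's WCELoss instead adapts
    weights during training, which is not reproduced here)."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * n) for c, n in counts.items()}

def weighted_cross_entropy(probs, label, weights):
    """Weighted cross-entropy for one sample: -w_y * log p_y,
    where probs maps each class to its predicted probability."""
    return -weights[label] * math.log(probs[label])
```

With 90 samples of class 0 and 10 of class 1, the minority class gets weight 5.0 versus about 0.56 for the majority class, so a misclassified minority sample contributes roughly nine times more to the loss.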
Keywords: BERT; BiGRU; police incident classification; imbalanced data; short text; sample weighting
Classification number: TP3 [Automation and Computer Technology: Computer Science and Technology]