Tunnel construction safety domain named entity recognition based on BERT-BiLSTM-CRF

Authors: ZHANG Nian; ZHOU Caifeng; WAN Fei; LIU Fei; WANG Yaoyao; XU Dongliang

Affiliations: [1] College of Civil Engineering, Taiyuan University of Technology, Taiyuan, Shanxi 030024, China; [2] Research Center of Tunneling and Underground Engineering of Ministry of Education, Beijing Jiaotong University, Beijing 100044, China; [3] Research Institute of Highway, Ministry of Transport, Beijing 100088, China

Source: China Safety Science Journal, 2024, No. 12, pp. 56-63 (8 pages)

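Funding: Central Government Guided Local Science and Technology Development Fund (YDZJSX20231A021); Transportation Power Pilot Project of the Research Institute of Highway, Ministry of Transport (QG2021-3-14-1); Open Research Fund of the Research Center of Tunneling and Underground Engineering of Ministry of Education (Beijing Jiaotong University) (TUC2024-03).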

Abstract: Traditional named entity recognition (NER) methods in the tunnel construction safety domain suffer from fuzzy entity boundaries, difficulty with small-sample learning, and insufficiently comprehensive and accurate extraction of feature information. To address these problems, an entity recognition method for tunnel construction accident text is proposed, based on a model that combines Bidirectional Encoder Representations from Transformers (BERT), a bidirectional long short-term memory (BiLSTM) network, and a conditional random field (CRF). First, the BERT model encodes the tunnel construction accident text into word vectors that carry semantic features. Then, the word vectors output by the trained BERT model are fed into the BiLSTM model, which further captures the contextual features of the text and predicts label probabilities. Finally, the constraints imposed by the annotation rules of the CRF layer correct the output of the BiLSTM model and yield the maximum-probability sequence labeling, which realizes intelligent classification of the labels of tunnel construction accident text. The model was compared with four other commonly used traditional NER models on a tunnel construction safety accident corpus. The results show that the recognition accuracy, recall, and F1 value of the BERT-BiLSTM-CRF model reach 88%, 89%, and 88% respectively, and that its entity recognition performance is better than that of the other baseline models. The established NER model was then used to recognize entities in actual tunnel construction accident texts, which verified its application effect in the tunnel construction safety domain.
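The final step in the abstract, where the CRF layer's labeling constraints correct the BiLSTM output and select the maximum-probability tag sequence, can be illustrated with a small Viterbi decoder. This is only a minimal sketch under stated assumptions, not the authors' implementation: the tag set (`O`, `B-ACC`, `I-ACC`), the emission scores, and the function names are hypothetical, and the learned CRF transition scores are simplified to a hard BIO rule (an `I-X` tag may only follow `B-X` or `I-X`).

```python
# Hypothetical BIO tag set; the paper's actual label scheme is not given in the abstract.
TAGS = ["O", "B-ACC", "I-ACC"]
NEG_INF = -1e9  # stands in for a forbidden transition

def allowed(prev, curr):
    """Hard BIO rule (assumption): I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True

def greedy(emissions, tags=TAGS):
    """Pick the best tag per token independently (no sequence constraint)."""
    return [tags[max(range(len(tags)), key=lambda j: row[j])] for row in emissions]

def viterbi(emissions, tags=TAGS):
    """Return the highest-scoring tag sequence that respects the BIO rule.

    emissions: one list of per-tag scores for each token, standing in
    for the BiLSTM output described in the abstract.
    """
    n = len(tags)
    # A sequence may not start with an I- tag.
    score = [NEG_INF if tags[j].startswith("I-") else emissions[0][j] for j in range(n)]
    back = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n):
            # Score of reaching tag j from each previous tag i.
            cand = [score[i] + (0.0 if allowed(tags[i], tags[j]) else NEG_INF)
                    for i in range(n)]
            best_i = max(range(n), key=lambda i: cand[i])
            new_score.append(cand[best_i] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Backtrack from the best final tag.
    path = [max(range(n), key=lambda j: score[j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return [tags[j] for j in path]

# Made-up scores for three tokens (columns follow the order of TAGS).
EMISSIONS = [
    [0.1, 2.0, 0.5],
    [1.0, 0.2, 0.9],
    [0.3, 0.4, 2.0],
]

print(greedy(EMISSIONS))   # per-token choice, may violate the BIO rule
print(viterbi(EMISSIONS))  # constrained, sequence-level choice
```

With these made-up scores, the per-token choice yields `B-ACC, O, I-ACC`, which breaks the BIO rule, while the constrained decoder returns `B-ACC, I-ACC, I-ACC`. A trained CRF would learn soft transition scores rather than use a fixed rule like this one.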

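Keywords: Bidirectional Encoder Representations from Transformers (BERT); bidirectional long short-term memory (BiLSTM) network; conditional random field (CRF); tunnel construction; safety domain; named entity recognition (NER); deep learning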

Classification No.: X928 [Environmental Science and Engineering: Safety Science]

 
