Research on Named Entity Recognition for Coalmine Safety


Authors: FENG Lin; ZHAO Chongshuai; LI Shuang [2]; XIE Yabo; LIU Peng

Affiliations: [1] School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116; [2] School of Economics and Management, China University of Mining and Technology, Xuzhou 221116; [3] Information Center of Zibo Mine Group, Zibo 255299; [4] National and Local Joint Engineering Laboratory of Internet Application Technology on Mine, China University of Mining and Technology, Xuzhou 221008

Source: Computer & Digital Engineering, 2023, No. 8, pp. 1881-1887, 1913 (8 pages in total)

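Funding: Supported by the National Natural Science Foundation of China (Grant No. 71972176) and the Fundamental Research Funds for the Central Universities (Grant No. 2020XGPY03).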

Abstract: Entity recognition in the coalmine safety domain is an important foundation for intelligent coal mining. To address the lack of publicly available annotated datasets and the scarcity of source material in this domain, the paper proposes a multi-round semi-automatic entity annotation method that significantly reduces manual annotation cost and is used to construct an entity dataset for coalmine safety. Drawing on the idea of transfer learning, the pre-trained language model RoBERTa serves as the word embedding layer, and a bidirectional long short-term memory (BiLSTM) network together with a conditional random field (CRF) performs semantic decoding. Because coalmine corpora have sparse features, an attention mechanism is introduced to further enhance the features. Together these components form an entity recognition model for the coalmine safety domain. Experiments show that the model outperforms current mainstream entity recognition models on the coalmine safety dataset.
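The abstract names a CRF as the final decoding stage. As a rough illustration of what that stage does, the following is a minimal pure-Python sketch of Viterbi decoding over BIO tags. It is not the authors' implementation: the tag set (including the hypothetical entity type `EQP`), the emission scores, and the transition scores are all invented toy values. In the described model, the emission scores would come from the RoBERTa, BiLSTM, and attention layers, and the transition scores would be learned.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]      : score of `tag` at token position t.
    transitions[prev][cur] : score of moving from tag `prev` to tag `cur`.
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any tag path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = predecessor of `tag` on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy setup (assumed, not from the paper): one hypothetical entity type "EQP".
TAGS = ["O", "B-EQP", "I-EQP"]
# All transitions are neutral except O -> I-EQP, which BIO tagging forbids.
TRANSITIONS = {p: {c: 0.0 for c in TAGS} for p in TAGS}
TRANSITIONS["O"]["I-EQP"] = -10000.0
# Hand-written emission scores for a three-token sentence.
EMISSIONS = [
    {"O": 2.0, "B-EQP": 0.5, "I-EQP": 0.1},
    {"O": 0.2, "B-EQP": 1.0, "I-EQP": 1.2},
    {"O": 0.3, "B-EQP": 0.2, "I-EQP": 1.5},
]

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
greedy = [max(TAGS, key=lambda tag: e[tag]) for e in EMISSIONS]
print(decoded)  # ['O', 'B-EQP', 'I-EQP']
print(greedy)   # ['O', 'I-EQP', 'I-EQP']
```

In this toy case, picking the best tag per token independently yields an invalid `O` followed by `I-EQP`, whereas decoding with transition scores returns a well-formed BIO sequence. This is the usual motivation for placing a CRF on top of a BiLSTM encoder.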

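Keywords: coalmine safety; entity recognition; entity annotation; pre-trained model; bidirectional long short-term memory network; attention mechanism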

Classification code: X75 [Environmental Science and Engineering: Environmental Engineering]

 
