Application of the BTM-BERT model in the automatic classification of safety hazards in civil aircraft maintenance


Authors: CHEN Fang[1]; ZHANG Yabo (School of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300300, China)

Affiliation: [1] School of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300300, China

Source: Journal of Safety and Environment (《安全与环境学报》), 2024, No. 11, pp. 4366-4373 (8 pages)

Funding: 2024 Civil Aviation Administration Safety Capability Project (ASSA202401).

Abstract: To define the categories of safety hazards in civil aviation maintenance and to classify safety hazard data automatically, the corpus of safety hazard records was first preprocessed with a purpose-built maintenance stop-word lexicon. Next, the Biterm Topic Model (BTM) was used to extract topics and keywords, which identified 12 categories of safety hazards, such as "employees failing to supervise the work site as required". Finally, the algorithm was fine-tuned on the dataset labeled by the BTM topic model to build an automatic classification model for maintenance safety hazard records based on Bidirectional Encoder Representations from Transformers (BERT), and the model was compared with traditional classification algorithms. The results show that the constructed model can classify civil aviation maintenance safety hazards automatically and performs far better than the traditional machine learning support vector machine algorithm. Compared with the text convolutional neural network algorithm, its precision, recall, and F1 improved by 0.12, 0.14, and 0.14, respectively, and its overall accuracy reached 93%.

To define the types of safety hazards in civil aviation maintenance and to classify safety hazard data automatically, we first used the safety hazard database of civil aviation maintenance units from 2020 to 2023 as the data source for text mining. The corpus of safety hazard records was preprocessed, a custom dictionary for maintenance safety hazard records was established, and Chinese word segmentation was performed using the Jieba package in Python. Next, Gibbs sampling was used to calculate perplexity and determine the optimal number of topics. The Biterm Topic Model (BTM) was applied to extract topics and keywords, identifying 12 types of safety hazards such as "employees failing to supervise the work site as required" and "employees failing to complete airline maintenance according to regulations." The top 15 keywords under each topic were also identified. Finally, the Bidirectional Encoder Representations from Transformers (BERT)-Base-Chinese model released by Google was chosen as the pre-trained model. This model consists of 12 Transformer encoder layers, each with 12 self-attention heads, totaling 110 million parameters. The algorithm was fine-tuned on the dataset labeled by the BTM topic model to construct an automatic classification model for maintenance safety hazard records based on BERT. The model predicted the hazard categories in the maintenance safety hazard database by analyzing event summaries. Precision, recall, and F1 score were selected as evaluation metrics for the training results, and the model was compared with traditional classification algorithms. This study shows that the constructed model can classify civil aviation maintenance safety hazards automatically. It not only outperforms the other algorithms in individual categories but also significantly surpasses the overall effectiveness of the traditional machine learning support vector machine algorithm. The precision, recall, and F1 score of the constructed classification model are 0.12, 0.14, and 0.14 higher, respectively, than those of the text convolutional neural network algorithm, and the overall accuracy reaches 93%.
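The abstract does not spell out how BTM represents a record, so the following is only a minimal sketch of the idea behind the model's name: BTM works on biterms, that is, unordered pairs of words that co-occur in the same short text. The sketch assumes each record has already been segmented and stop-word filtered, and it treats the whole record as one co-occurrence window. The function names and the two sample records are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short document."""
    # Sort each pair so that (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(corpus):
    """Count biterms over a corpus of already-segmented documents."""
    counts = Counter()
    for tokens in corpus:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical, already-segmented hazard records (not data from the paper).
corpus = [
    ["employee", "work_site", "supervision"],
    ["employee", "supervision", "regulation"],
]
biterm_counts = count_biterms(corpus)
```

Topic inference (for example by Gibbs sampling, as the abstract mentions) would then run over these corpus-level biterms rather than over per-document word counts, which is what makes the approach suited to short records.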

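Keywords: safety engineering; aircraft maintenance; Biterm Topic Model (BTM); Bidirectional Encoder Representations from Transformers (BERT); safety hazards; text classification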

Classification number: X93 [Environmental Science and Engineering - Safety Science]

 
