Research on Risk Element Identification in Civil Aviation Maintenance Based on the TG-LDA Model  Cited by: 2

Research on Risk Element Identification of Civil Aviation Maintenance Based on Text Mining


Authors: LIU Wei-wei (刘伟伟); WANG Hua-wei (王华伟)[1]; NI Xiao-mei (倪晓梅); HOU Zhao-guo (侯召国); PENG Ke (彭珂) (Nanjing University of Aeronautics and Astronautics, Nanjing 211000, China; Nanjing Vocational University of Industry Technology, Nanjing 210000, China)

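Affiliations: [1] Nanjing University of Aeronautics and Astronautics, Nanjing 211000, Jiangsu, China [2] Nanjing Vocational University of Industry Technology, Nanjing 210000, Jiangsu, China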

Source: Aeronautical Computing Technique (《航空计算技术》), 2023, No. 6, pp. 45-49, 54 (6 pages in total)

Funding: Supported by the National Natural Science Foundation of China (72271123).

Abstract: Safety-risk research in civil aviation maintenance makes insufficient use of text data, so risk elements are missed. To address this, a risk element identification model based on an improved LDA, TG-LDA (TF-IDF and Gaussian function-LDA), is proposed. First, a domain dictionary for aircraft maintenance is constructed to improve the low word segmentation accuracy in text-mining preprocessing. Then, because the input to the LDA topic model is large and noisy, a dual term-optimization model combining the TF-IDF algorithm with a Gaussian function is used to optimize that input. Finally, 26 categories of risk elements for unsafe maintenance events are identified and analyzed through visualization. The results show that, compared with the traditional algorithm, perplexity falls from 7.19×10^(-4) to 2.13×10^(-4), which mitigates the omission of risk elements in text mining. The main risk elements identified in the maintenance domain are personnel cognitive bias, violations during maintenance work, personnel forgetting/omission, incomplete inspection, and aircraft component failure.
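The abstract names a term filter that combines TF-IDF with a Gaussian function but does not say how the two are combined. The following is only a minimal sketch under stated assumptions, not the authors' method: each term's mean TF-IDF score is multiplied by a Gaussian weight centred on the vocabulary's mean document-frequency ratio, and only the highest-scoring terms are kept as LDA input. The function names `gaussian_weight`, `score_terms`, `filter_docs`, the smoothed IDF variant, the `keep` parameter, and the toy reports are all hypothetical.

```python
import math
from collections import Counter


def gaussian_weight(x, mu, sigma):
    """Unnormalised Gaussian: 1.0 at x == mu, decaying towards 0 away from it."""
    if sigma == 0:
        return 1.0
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))


def score_terms(docs):
    """Return {term: mean TF-IDF * Gaussian weight on document-frequency ratio}."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))

    # Mean TF-IDF of each term over the corpus (smoothed IDF, an assumed variant).
    tfidf = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            tfidf[term] += (count / len(doc)) * idf / n_docs

    # Gaussian centred on the mean document-frequency ratio (assumed design,
    # not taken from the paper).
    ratios = {t: df[t] / n_docs for t in df}
    mu = sum(ratios.values()) / len(ratios)
    sigma = math.sqrt(sum((r - mu) ** 2 for r in ratios.values()) / len(ratios))
    return {t: tfidf[t] * gaussian_weight(ratios[t], mu, sigma) for t in df}


def filter_docs(docs, keep=5):
    """Keep only the `keep` highest-scoring terms in every document."""
    scores = score_terms(docs)
    vocab = set(sorted(scores, key=scores.get, reverse=True)[:keep])
    return [[t for t in doc if t in vocab] for doc in docs], vocab


# Toy, already-segmented reports (invented for illustration only).
docs = [
    ["engineer", "forgot", "tool", "in", "engine", "cowl"],
    ["inspection", "missed", "crack", "in", "engine", "blade"],
    ["engineer", "skipped", "step", "in", "work", "card"],
]
filtered, vocab = filter_docs(docs, keep=5)
```

In a full pipeline, the filtered token lists would then be passed to an LDA implementation, and the number of retained terms would need to be tuned rather than fixed.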

Keywords: text mining; aircraft maintenance; LDA topic model; TF-IDF; Gaussian function

Classification code: X949 [Environmental Science and Engineering: Safety Science]

 
