Analysis Method for Aviation Safety Report Information in a Python Environment  Cited by: 4

The Analysis Method of Aviation Safety Reporting Information Based on Python


Authors: LIU Jun-jie (刘俊杰) [1]; DU Yin-lan (杜尹岚); YAN Hui-juan (闫慧娟) (Research Institute of Civil Aviation Safety, Civil Aviation University of China, Tianjin 300300, China)

Affiliation: [1] Research Institute of Civil Aviation Safety, Civil Aviation University of China, Tianjin 300300

Source: Science Technology and Engineering (《科学技术与工程》), 2021, No. 10, pp. 4278-4283 (6 pages)

Funding: Civil Aviation Safety Capacity Building Fund (20600719).

Abstract: To extract potential safety hazards quickly, accurately, and efficiently from the large volume of routinely collected aviation safety information, and to give safety risk control a clear direction for improvement, combining text analysis with machine learning to cluster a given type of aviation safety information by content is an important basis for mining useful information. Taking the system failure/jam/fault events collected by Chinese civil aviation in 2017 as the sample, the text was preprocessed in a Python 3.6 environment, features were extracted with logarithmic term frequency-inverse document frequency (TF-IDF), and the K-means method was applied to build an automatic clustering model of the sample; the results were visualized after dimensionality reduction by multidimensional scaling (MDS). The results show that text clustering and visualization can quickly and automatically sort and file the information, identify the degree of similarity among sample records, make key information easy to pinpoint, and support targeted measures for subsequent risk management and control.
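The feature-extraction and clustering steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The abstract does not give the exact weighting formula, so the weight (1 + ln tf) × (ln(N/df) + 1) used here is an assumption (one common logarithmic TF-IDF variant). The English token lists are hypothetical stand-ins for segmented Chinese report text, the farthest-point seeding of K-means is a simplification chosen for reproducibility, and the MDS projection step is left out.

```python
import math
from collections import Counter


def log_tfidf(docs):
    """Weight each term by (1 + ln tf) * (ln(N / df) + 1), then L2-normalise each row.

    The exact formula is an assumption; the abstract only says "logarithmic TF-IDF".
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vocab = sorted(df)
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for term, tf in Counter(doc).items():
            vec[index[term]] = (1.0 + math.log(tf)) * (math.log(n_docs / df[term]) + 1.0)
        norm = math.sqrt(sum(w * w for w in vec)) or 1.0
        vectors.append([w / norm for w in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(vectors, k):
    """Deterministic seeding: start from the first vector, then repeatedly add
    the vector whose nearest already-chosen centroid is farthest away."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(vectors, k, n_iter=100):
    """Lloyd's algorithm; returns one cluster label per input vector."""
    centroids = farthest_point_init(vectors, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical, already-tokenised event descriptions (not data from the study).
docs = [
    ["hydraulic", "pump", "failure", "low", "pressure"],
    ["hydraulic", "pressure", "low", "pump", "leak"],
    ["flap", "jammed", "during", "approach"],
    ["flap", "jammed", "after", "takeoff"],
]

vocab, vectors = log_tfidf(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

Because the rows are L2-normalised, squared Euclidean distance is a monotone function of cosine similarity, so reports sharing vocabulary fall into the same cluster. A 2-D view like the one described in the abstract would additionally need an MDS projection of the pairwise distances, which this sketch does not attempt.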

Keywords: aviation safety information; information analysis; text clustering; cluster visualization

Classification code: X949 [Environmental Science and Engineering: Safety Science]

 
