Association Rule Mining for Railway Locomotive Accident and Fault Based on Optimized MsEclat Algorithm  (Cited by: 11)


Authors: LI Xin[1]; SHI Tianyun; CHANG Bao; MA Xiaoning; LIU Jun

Affiliations: [1] Postgraduate Department, China Academy of Railway Sciences, Beijing 100081, China; [2] Department of Science, Technology and Information Technology, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China; [3] Institute of Computing Technology, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China

Source: China Railway Science, 2021, No. 4, pp. 155-165 (11 pages)

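Funding: Major Project of the Science and Technology Research and Development Program of China National Railway Group Co., Ltd. (P2019G003).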

Abstract: To mine association rules related to locomotive accidents and faults from railway locomotive big data, an optimized MsEclat algorithm is proposed. First, an improved Eclat algorithm, MsEclat, is proposed: a minimum support index table is constructed, the data set is rebuilt with the items ordered by their support values, and frequent itemsets for the different items are obtained by vertical mining, which overcomes the inability of Eclat to mine association rules under multiple minimum supports. Second, MsEclat is further improved into the optimized MsEclat algorithm: frequent itemset mining steps are designed on the basis of a Boolean matrix combined with the MapReduce parallel programming model, improving execution efficiency in big data analysis scenarios. A comparison of algorithms verifies the computational efficiency advantage of MsEclat and its optimized version in mining association rules with multiple minimum supports. Finally, taking the locomotive operation and maintenance big data of a railway bureau as an example, the optimized MsEclat algorithm is used to mine association rules for locomotive accidents and faults. The results show that, with 6 distributed nodes, the algorithm takes 3.945034 s and mines 156 frequent itemsets; for example, among locomotives with a high incidence of operation faults, there is an 83.78% probability that frequent driving safety equipment problems occur as well. Once the corresponding association rules are formed, they can be used to analyze the occurrence of accidents and faults and the quality and safety status of the bureau's locomotives.
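The abstract names the ingredients of MsEclat (vertical mining, a per-item minimum support table, a Boolean representation) but this record does not give the algorithm itself. The following is only a minimal single-machine sketch of how Eclat-style vertical mining can work with multiple minimum supports, not the authors' implementation. It rests on several assumptions that are not in the source: an itemset counts as frequent when its support reaches the lowest minimum support among its items (the usual multiple-minimum-support convention), items are sorted by that per-item threshold, each Boolean matrix row is stored as a Python integer bitmask, and the names `build_vertical`, `ms_eclat_sketch`, and `mis` are hypothetical. The MapReduce parallelization is not sketched.

```python
from typing import Dict, FrozenSet, List, Sequence, Set


def build_vertical(transactions: Sequence[Set[str]]) -> Dict[str, int]:
    """Vertical layout: map each item to a bitmask whose bit k is set
    when transaction k contains the item (one Boolean row per item)."""
    vertical: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item] = vertical.get(item, 0) | (1 << tid)
    return vertical


def support(mask: int) -> int:
    """Support count = number of set bits in the bitmask."""
    return bin(mask).count("1")


def ms_eclat_sketch(
    transactions: Sequence[Set[str]], mis: Dict[str, int]
) -> Dict[FrozenSet[str], int]:
    """Mine itemsets under per-item minimum supports (hypothetical sketch).

    Assumption: an itemset is frequent when its support count is at least
    the smallest per-item threshold among its members.
    """
    vertical = build_vertical(transactions)
    # Assumption: sort items by ascending threshold, so the first item of
    # every itemset fixes the threshold for the whole search subtree.
    order = sorted(vertical, key=lambda item: (mis[item], item))
    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: List[str], prefix_mask: int,
               candidates: List[str], threshold: int) -> None:
        for idx, item in enumerate(candidates):
            mask = prefix_mask & vertical[item]  # intersect Boolean rows
            count = support(mask)
            if count >= threshold:
                itemset = prefix + [item]
                frequent[frozenset(itemset)] = count
                extend(itemset, mask, candidates[idx + 1:], threshold)

    for pos, item in enumerate(order):
        count = support(vertical[item])
        if count < mis[item]:
            continue  # no itemset starting with this item can qualify
        frequent[frozenset([item])] = count
        extend([item], vertical[item], order[pos + 1:], mis[item])
    return frequent


# Toy data (invented for illustration): item "D" is rare, so it gets a
# lower threshold than the common items "A", "B", and "C".
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"},
                {"B", "C"}, {"A", "B", "D"}]
mis = {"A": 3, "B": 3, "C": 3, "D": 1}
frequent = ms_eclat_sketch(transactions, mis)
```

With a single global threshold of 3, every itemset containing the rare item `D` would be discarded; the per-item table keeps them while still filtering weak combinations of common items such as `{A, C}`. This is the general motivation for multiple minimum supports when rare but important events, such as accidents, sit alongside frequent routine records.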

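Keywords: locomotive accident and fault; association rules; big data analysis; data mining technology; MsEclat algorithm; multiple minimum supports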

Classification number: TP301. [Automation and Computer Technology: Computer System Architecture]

 
