Microservice Fault Diagnosis Method Based on Similarity Matching  Cited by: 8

Fault Diagnosis Method Based on Trace Similarity Matching


Authors: CHEN Hao; XU Yuan-Jia; WANG Tao [1,3]; ZHANG Wen-Bo [1,3]

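Affiliations: [1] Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; [2] University of Chinese Academy of Sciences, Beijing 100049, China; [3] State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China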

Source: Computer Systems & Applications, 2021, No. 5, pp. 1-11 (11 pages)

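Funding: National Key R&D Program of China (2017YFB1400804); National Natural Science Foundation of China (61872344); Beijing Natural Science Foundation (4182070); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2018144).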

Abstract: With the rapid growth of Internet services, distributed microservice applications have gradually replaced traditional monolithic applications as one of the main forms of Internet application. Microservice applications offer scalability, fault tolerance, and high availability, but they are also cumbersome to build, complex to deploy, and difficult to maintain. Monitoring and operating microservices in cloud environments, for example on container-based cluster management systems such as Kubernetes, is an active research topic, yet existing approaches remain coarse-grained and locate faults inaccurately. To address these problems, this paper proposes a microservice fault diagnosis method based on trace similarity matching. First, an injected proxy forwards request traffic so that trace information for the microservices can be collected and modeled. Second, state information is collected while the system runs normally, and known faults are injected to collect and characterize the application's behavior after each fault occurs. Finally, the execution trace of an unknown fault is matched against the execution traces of known faults, using string edit distance as the similarity measure, to diagnose the likely cause. Experimental results show that the method characterizes the execution traces of request processing effectively and locates the cause of application faults accurately at microservice granularity.

Keywords: cloud computing; fault diagnosis; execution trace; microservices

Classification: TP393.0 [Automation and Computer Technology: Computer Application Technology]
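
Illustrative sketch: the matching step described in the abstract (comparing the execution trace of an unknown fault with traces recorded under known, injected faults, using edit distance as the similarity measure) could look roughly like the following. This is a minimal sketch and not the authors' implementation. It assumes a trace is flattened into an ordered list of service names, and that similarity is the edit distance normalized by the longer trace length; the abstract specifies neither the encoding nor the normalization. The function names, service names, and fault labels are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences (here: service names)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,           # delete x
                            curr[j - 1] + 1,       # insert y
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed normalization: 1.0 for identical traces, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def diagnose(unknown: Sequence[str],
             known_faults: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Rank known fault labels by similarity to the unknown trace, best first."""
    scored = [(label, similarity(unknown, trace))
              for label, trace in known_faults.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical traces recorded during normal operation and fault injection.
known = {
    "normal": ["gateway", "order", "inventory", "payment"],
    "payment-down": ["gateway", "order", "inventory"],
    "inventory-down": ["gateway", "order"],
}
observed = ["gateway", "order", "inventory"]
ranking = diagnose(observed, known)
```

Here `ranking[0]` names the known fault whose trace is closest to the observed one. A real system would likely need a richer trace encoding (call-tree structure, status codes, latencies) than a flat list of names.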

 
