一种验证分布式协议活性属性容错机制的模型检测方法  被引量:3

A Model Checking Method for Verifying the Fault Tolerance of Distributed Protocol Liveness Properties

在线阅读下载全文

作  者:陆超逸 聂长海[1] 张成志[2] LU Chao-Yi;NIE Chang-Hai;ZHENG Cheng-zhi(State Key Laboratory for Novel Software Technology,Nanjing University,Nanjing 210023;The Hong Kong University of Science and Technology,Hong Kong 999077)

机构地区:[1]南京大学计算机软件新技术国家重点实验室,南京210023 [2]香港科技大学,香港999077

出  处:《计算机学报》2021年第8期1714-1731,共18页Chinese Journal of Computers

基  金:国家重点研发计划(2018YFB1003800)资助.

摘  要:云计算是一种通过网络以服务的方式向用户提供按需收费的计算资源的模式,目前企业逐渐将业务部署、数据处理转移到云计算平台上进行。因为可扩展性、性能等各方面需求,所以云平台部署在分布式系统上。由于分布式系统采用大量的商品机通过复杂的结构进行搭建,因此分布式系统中组件发生故障是无法避免的。为了提高分布式系统的可靠性,技术人员在开发分布式系统时为其设计了容错机制。为了保证容错机制在分布式系统发生故障时能真正有效地工作,故障注入是检验容错机制的方法之一,通过人为地向系统中注入特定的故障,观察系统的行为并检验容错机制是否正确工作。由于分布式系统的并发特性,传统软件测试方法无法对其进行完全测试,近年来越来越多地使用模型检测技术来对分布式系统进行验证。现有的模型检测技术注重对分布式系统的安全性属性和活性属性的检测,忽略了对容错机制尤其是活性属性容错机制的检测,所以如何验证系统的活性属性容错机制是目前面临的挑战。采用抽象模型检测方法会引入模型与实际系统不匹配的问题。同时,采用实现级模型检测方法会加剧模型检测中的状态空间爆炸问题。本文提出了一个实现级模型检测工具LTMC(Liveness Properties Fault Tolerance Model Checker),结合故障注入技术对分布式协议的安全性属性与活性属性及其容错机制进行验证。同时,基于分布式系统节点的角色,本文提出了一种对等约减策略PRP(Peer Reduction Policy)对LTMC需要搜索的状态空间进行约减,缓解了状态空间爆炸问题。此外,LTMC通过引入逻辑时钟机制,优先搜索那些更有实际价值的事件执行路径。LTMC能够有目标地在待验证系统运行的特定时刻注入特定的故障,而不依赖于随机故障注入策略;当待验证系统发生改变时,只需要简单地对工�Cloud computing is a new mode that provides users with computing resources charged on demand through a communication network as a service.At present,enterprises are gradually transferring business deployment and data processing to cloud computing platforms.Because of various requirements such as scalability and performance,cloud platforms are deployed on distributed systems.Since the distributed system uses a large number of commodity machines to build through a complex structure,technicians cannot guarantee that these machines will work correctly all the time.Therefore,component failures in the distributed system are unavoidable.In order to improve the reliability of the distributed system,the technicians designed fault-tolerant mechanisms for them when developing the distributed system.In order to ensure that the faulttolerant mechanisms can really work correctly when there is a failure in the distributed system,fault injection is one of the most effective ways to test the fault-tolerant mechanism,observing the behavior of the system and verifying whether the fault-tolerance mechanism is working correctly,by artificially injecting specific faults into the system under test.Due to the concurrent nature of distributed systems,it is very difficult for traditional software testing methods to fully test such systems.In recent years,technicians have increasingly used more systematic model checking techniques to verify distributed systems.However,existing model checking technology focuses on the checking of the safety properties and liveness properties of distributed systems,and ignores the checking of fault-tolerant mechanisms,especially liveness properties fault-tolerant mechanisms.As a result,how to verify the liveness properties fault-tolerant mechanism of the distributed system is currently a challenge.The use of abstract model checking method will introduce the problem of mismatch between the model and the actual system because of human error.At the same time,the use of implementation-level model checking method

关 键 词:分布式系统 模型检测 故障注入 活性属性 容错机制 对等约减策略 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象