A Log Anomaly Detection Algorithm for Debugging Based on Grammar-Based Codes (调试中基于文法编码的日志异常检测算法)

Authors: Wang Nan (王楠)[1,2], Han Jizhong (韩冀中)[3], Fang Jinyun (方金云)[1]

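Affiliations: [1] High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] University of Chinese Academy of Sciences, Beijing 100190; [3] Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100195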

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2013, No. 4, pp. 677-685 (9 pages)

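Funding: National Natural Science Foundation of China (61070028; 61003063; 60903047); National High Technology Research and Development Program of China ("863" Program) (2011AA01A203); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200)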

Abstract: Debugging non-deterministic bugs is important to software development. In recent years, with the rapid growth of cloud computing systems and deeper research into record-and-replay debugging, using anomaly detection to find abnormal information in large volumes of text console logs or control-flow logs has become increasingly important for debugging. Most traditional anomaly detection algorithms, however, were designed to detect and defend against network attacks. Many of them rest on the Markov assumption and are sensitive to drastic changes in the event stream. The new problems in system diagnosis instead require detecting anomalous behavior at the semantic level, and experiments show that existing Markov-based algorithms perform poorly at this. This paper presents a new anomaly detection algorithm based on grammar-based codes. The algorithm does not rely on statistical models, probability models, machine learning, or the Markov assumption. It is simple in principle and easy to implement. It was tested on both generated sequences and real logs, and all test results were positive. Compared with traditional methods, it is more sensitive to high-level semantic anomalies.

Keywords: debugging; anomaly detection; grammar-based codes; data mining; record and replay

Classification code: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

 
