Evaluation method for log partition without ground truth based on information entropy  Cited by: 3

Authors: LIN Leilei; YANG Liang; WEN Lijie[1]; ZHOU Hua; WANG Jianmin[1]

Affiliations: [1] School of Software, Tsinghua University, Beijing 100084, China; [2] Inspur General Software Co., Ltd., Jinan 250101, Shandong, China; [3] School of Big Data and Intelligence Engineering, Southwest Forestry University, Kunming 650224, Yunnan, China

Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》), 2020, No. 6, pp. 1483-1491 (9 pages)

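Funding: National Key Research and Development Program of China (2017YFA0700605); National Natural Science Foundation of China (61472207, 71690231); Beijing National Research Center for Information Science and Technology.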

Abstract: To improve the quality of process model discovery, log partition can be used to divide the raw log data into multiple sub-logs. Existing methods for evaluating log partition mostly rely on ground truth to measure partition quality, but labeled log data is difficult to obtain in practice. For this reason, partition entropy is proposed as a measure for evaluating log partition without ground truth. First, trace variants are defined to characterize the distribution of each sub-log. Second, internal entropy and external entropy are proposed to describe, respectively, the cohesion within sub-logs and the divergence among them. Third, a penalty factor is used to penalize partition methods that blindly cater to the evaluation criterion of high cohesion and low coupling. Finally, internal entropy, external entropy, and the penalty factor are combined into the expression for partition entropy. Experimental results show the feasibility of the proposed method.

Keywords: process mining; log partition; information entropy; trace clustering

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]
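The abstract does not give the formulas for internal entropy, external entropy, or the penalty factor, so the following is only a minimal sketch of the two building blocks it names: the trace-variant distribution of a sub-log and the Shannon entropy of that distribution. The function names, the toy sub-logs, and the use of plain Shannon entropy as a stand-in for a cohesion score are assumptions made for illustration; this is not the paper's partition entropy.

```python
from collections import Counter
from math import log2


def variant_distribution(sub_log):
    """Relative frequency of each trace variant (distinct activity sequence).

    Hypothetical helper: a sub-log is assumed to be a list of traces,
    and each trace a list of activity names.
    """
    counts = Counter(tuple(trace) for trace in sub_log)
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}


def shannon_entropy(distribution):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * log2(p) for p in distribution.values() if p > 0)


# Toy partition of a log into two sub-logs (invented data).
sub_logs = [
    [["a", "b", "c"], ["a", "b", "c"], ["a", "c", "b"], ["a", "c", "b"]],
    [["a", "d"], ["a", "d"], ["a", "d"]],
]

# Assumption: lower entropy is read here as a more homogeneous sub-log.
entropies = [shannon_entropy(variant_distribution(s)) for s in sub_logs]
```

In this toy partition the first sub-log mixes two equally frequent variants and the second contains a single variant, so the second scores as more homogeneous. A complete measure along the lines of the abstract would also need a term comparing sub-logs with each other and a penalty for degenerate partitions, neither of which is sketched here.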

 
