Authors: LI Wenqing (李文清); GAO Ping (高平); LI Guangsong (李光松)
Affiliation: [1] Information Engineering University, Zhengzhou 450001, Henan, China
Source: Journal of Information Engineering University (《信息工程大学学报》), 2021, No. 1, pp. 74-80 (7 pages)
Funding: National Natural Science Foundation of China, research group funding project (61521003).
Abstract: DEFLATE is the most widely used open-source compression algorithm in the computer field, and a large number of network protocols and applications use it to compress data. In the current era of big data, analyzing the features of data compression algorithms is necessary both for adapting those algorithms and for providing a basis for identifying compressed traffic in network traffic identification. Using data analysis methods, this paper divides the algorithm into two modules in line with the DEFLATE processing flow and designs four analysis indicators (chi-square, information entropy, weighted cumulative sum, and mean value of byte runs) to study the features of the DEFLATE algorithm. The analysis finds that the compression performance of the LZ77 module is closely related to that of the entire algorithm, indicating that DEFLATE depends heavily on its LZ77 module for compression efficiency. After different types of data are compressed with DEFLATE, the indicators tend to converge while still showing a certain degree of differentiation: compared with the original data, the compressed data is closer to random data, and the compressed data corresponding to different file types still shows certain differences in its statistical characteristics.
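Keywords: DEFLATE algorithm; data analysis; LZ77 module; algorithm features
Classification number: O212.6 [Science: Probability Theory and Mathematical Statistics]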
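
The abstract's claim that compressed data is closer to random data than the original can be illustrated with a small sketch. This is not the authors' code: it uses Python's standard `zlib` module as a stand-in DEFLATE implementation, computes only three of the four indicators (the abstract does not define the weighted cumulative sum, so it is omitted), and assumes that the "mean value of byte runs" means the average length of runs of identical consecutive bytes. The function names and the synthetic sample text are hypothetical.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def chi_square_uniform(data: bytes) -> float:
    """Pearson chi-square statistic of the byte histogram against a uniform one."""
    if not data:
        return 0.0
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def mean_run_length(data: bytes) -> float:
    """Average length of runs of identical consecutive bytes (assumed definition)."""
    if not data:
        return 0.0
    # A new run starts wherever a byte differs from its predecessor.
    runs = 1 + sum(1 for a, b in zip(data, data[1:]) if a != b)
    return len(data) / runs


def make_sample_text(words: int = 20000, seed: int = 0) -> bytes:
    """Hypothetical test input: pseudo-text drawn from a small vocabulary."""
    vocab = ["data", "stream", "block", "window", "match", "literal",
             "length", "distance", "code", "tree", "byte", "symbol"]
    rng = random.Random(seed)
    return " ".join(rng.choice(vocab) for _ in range(words)).encode("ascii")


def summarize(data: bytes) -> dict:
    """Collect the three indicators for one byte string."""
    return {
        "size": len(data),
        "entropy": byte_entropy(data),
        "chi_square": chi_square_uniform(data),
        "mean_run": mean_run_length(data),
    }


if __name__ == "__main__":
    raw = make_sample_text()
    packed = zlib.compress(raw, 9)  # zlib wraps a DEFLATE stream
    for label, blob in (("raw", raw), ("deflate", packed)):
        print(label, summarize(blob))
```

On text-like input of this kind, the compressed stream should show higher byte entropy and a much smaller chi-square statistic than the original, which is consistent with the abstract's observation. The exact values depend on the input and on the zlib build.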