检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:杜科星 张小芳[2] 张晓[2] 赵晓南[2] Du Kexing;Zhang Xiaofang;Zhang Xiao;Zhao Xiaonan b(College of Software,Northwestern Polytechnical University,Xi’an 710072,China;College of Computer,Northwestern Polytechnical University,Xi’an 710072,China)
机构地区:[1]西北工业大学软件学院,西安710072 [2]西北工业大学计算机学院,西安710072
出 处:《计算机应用研究》2022年第3期831-835,共5页Application Research of Computers
基 金:国家重点研发计划资助项目(2018YFB1004401);北京市自然科学基金——海淀原始创新联合基金资助项目(L192027);陕西省重点产业链项目(2021ZDLGY03-02,2021ZDLGY03-08)。
摘 要:传统缓存算法存在命中率低、交换率高等问题,且现有缓存算法在分布式大数据存储系统中并不适用,为此提出了一种基于频繁序列挖掘的自适应缓存策略。该方法使用数据挖掘算法挖掘历史访问窗口内的频繁序列,将频繁序列模糊合并后构建匹配模式集合以供查询。当新的访问来临时,将固定访问长度内的子序列与匹配模式集合进行匹配,然后根据匹配结果预取数据,同时结合修改后的S4LRU(4-segmented least recently used)数据结构进行缓存数据换出。在公开的大数据处理trace集上进行了仿真实验,实验结果表明,在不同的缓存大小下,提出算法与现有典型缓存算法相比,平均命中率提高了0.327倍,平均交换率降低了0.33倍,同时具有低开销和高时效的特点。此结果表明,该方法较传统替换算法而言是一个更为有效的缓存策略。Traditional cache algorithms have problems such as low hit rate and high exchange rate. And the existing caching algorithm is not applicable in the distributed big data storage system. This paper proposed an adaptive caching strategy based on frequent sequence mining. This method used a data mining algorithm to mine the frequent sequences in the historical access window, and merged the frequent sequences to construct a set of matching patterns for query. When a new access coming, matched the subsequence within the fixed access length with the matching pattern set, and then prefetched the data according to the matching result, and combined with the modified S4 LRU(4-segmented least recently used) data structure for cache data exchange out. This paper conducted simulation experiments on the public big data processing trace set. The experimental results show that, under different cache sizes, compared with the existing typical cache algorithms, the proposed algorithm increases the average hit rate by 0.327 times and the average exchange rate reduces by 0.33 times, at the same time has the characteristics of low overhead and high time efficiency. This result shows that the proposed method is a more effective caching strategy than the traditional replacement algorithm.
分 类 号:TP3115[自动化与计算机技术—计算机软件与理论]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.7