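Authors: Cao Qian (曹倩)[1], Hu Changjun (胡长军)[1], Zhang Yunxing (张云星)[1], Zhu Yutian (朱于畋)[1]
Source: Chinese Journal of Computers (《计算机学报》), 2011, No. 5, pp. 899-911 (13 pages)
Funding: National Science and Technology Major Project (2009ZX03004-004; 2009ZX01045-005-002); National High Technology Research and Development Program of China ("863" Program) (2008AA01Z109; 2006AA01Z105); National Natural Science Foundation of China (60373008); Key Project of Science and Technology Research of the Ministry of Education (108008; 106019).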
Abstract: Irregular problems are pervasive in large-scale parallel applications and are a key factor limiting program efficiency. Software cache is a commonly used way to handle them on the Cell processor. A conventional software cache ignores the memory access pattern of irregular references and fixes the cache line at a single length, which increases the load on memory bandwidth and limits cache utilization. This paper proposes an adaptive cache line algorithm that continuously adjusts the cache line size during program execution according to the characteristics of the irregular memory accesses, thereby reducing the amount of data transferred.

To support different line sizes, the paper also presents a corresponding software cache structure, the hybrid line size cache (HLSC). It contains several Tag Entry Arrays, each corresponding to one cache line size. The design is hierarchical: when a miss occurs in the long-line Tag Entry Array, the miss handler is invoked at once; when a miss occurs in the short-line Tag Entry Array, the miss handler is started and, at the same time, the long-line Tag Entry Array is checked. If the long-line array hits, the miss handling is abandoned. This hierarchical lookup of the Tag Entry Arrays significantly increases the cache hit rate.

In addition, the paper proposes a new line-index-aligned replacement policy (IndAlign_LRU), which implements least recently used (LRU) replacement when multiple cache line sizes coexist.

Experiments show that the adaptive cache line strategy greatly reduces redundant data transfer and improves the hit rate. Compared with fixed cache lines of 1024 B, 512 B, 256 B, and 128 B, the adaptive cache line strategy improves execution speed by 28.9%, 29.7%, 32.1%, and 33.5%, respectively.
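The hierarchical lookup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the two line sizes, the direct-mapped layout, the number of sets, and every name (`TagArray`, `HybridCache`, `access`) are assumptions made for illustration. The abstract describes the miss handler starting while the long-line array is checked; the sketch checks the long-line array first and only then handles the miss, which yields the same outcome in a sequential model.

```python
from dataclasses import dataclass, field

SHORT_LINE = 128    # bytes; assumed short line size
LONG_LINE = 1024    # bytes; assumed long line size
NUM_SETS = 4        # assumed; each tag array is direct-mapped here


@dataclass
class TagArray:
    """One tag array for a single line size (hypothetical layout)."""
    line_size: int
    num_sets: int = NUM_SETS
    tags: dict = field(default_factory=dict)  # set index -> cached line number

    def _split(self, addr):
        line = addr // self.line_size
        return line % self.num_sets, line

    def hit(self, addr):
        idx, line = self._split(addr)
        return self.tags.get(idx) == line

    def fill(self, addr):
        idx, line = self._split(addr)
        self.tags[idx] = line


class HybridCache:
    """Two tag arrays, looked up hierarchically as the abstract describes."""

    def __init__(self):
        self.short = TagArray(SHORT_LINE)
        self.long = TagArray(LONG_LINE)
        self.bytes_transferred = 0

    def _miss(self, array, addr):
        # Stand-in for the miss handler: fetch one line of the array's size.
        array.fill(addr)
        self.bytes_transferred += array.line_size

    def access(self, addr, line_size):
        """Return 'short-hit', 'long-hit', or 'miss' for one reference."""
        if line_size == LONG_LINE:
            if self.long.hit(addr):
                return "long-hit"
            self._miss(self.long, addr)   # long-line miss: handle at once
            return "miss"
        if self.short.hit(addr):
            return "short-hit"
        if self.long.hit(addr):           # short miss, long hit:
            return "long-hit"             # miss handling is abandoned
        self._miss(self.short, addr)
        return "miss"
```

In this sketch, a short-line reference to an address that already sits inside a cached long line returns `"long-hit"` and transfers no additional bytes, which is the effect the hierarchical design is meant to achieve.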
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]