Authors: ZHENG Kang (郑康), LI Chen (李晨)[1], CHEN Hai-yan (陈海燕)[1], LIU Sheng (刘胜)[1], FANG Liang (方粮)[1] (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
Affiliation: [1] College of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, China
Source: Computer Engineering & Science (《计算机工程与科学》), 2023, No. 11, pp. 1929-1940 (12 pages)
Funding: National Natural Science Foundation of China (62202478); National University of Defense Technology research project (ZK20-04).
Abstract: In recent years, as integrated circuit technology has advanced, the speed gap between processors and memory has kept widening, and memory has increasingly become the bottleneck that limits computing system performance. DSPs for embedded and low-power applications differ from general-purpose CPUs in architecture and application scenarios, so CPU memory access designs cannot meet their memory access requirements. To address the requirements of Very Long Instruction Word (VLIW) DSPs for real-time memory access, in-order data return with fixed latency, and efficient data consistency, a scalar memory access unit suited to DSPs is designed. Its configurable design meets the real-time memory access requirements of DSPs. An ID-based ordering mechanism guarantees the ordering and fixed latency that the VLIW architecture requires for data returned by Load instructions, at a storage overhead of 87.5 B. A hardware search for the first 1 (leading one) accelerates the write-back operations needed for data consistency. When 25%, 50%, and 75% of the cache lines need to be written back, the optimized consistency write-back overhead is 26.4%, 51.3%, and 76.2% of that of the line-by-line scan method; it is proportional only to the number of valid dirty lines and is independent of cache capacity.
Classification: TP333 [Automation and Computer Technology: Computer Architecture]
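The abstract's claim that write-back cost tracks the number of dirty lines rather than the cache capacity can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the dirty state is assumed to be a bitmap held in an integer, the search is assumed to start from the lowest-numbered line, and the function names `writeback_line_scan` and `writeback_first_one` are hypothetical. The step counts are abstract loop iterations, not measured cycles, so they are not expected to reproduce the percentages reported in the abstract.

```python
def writeback_line_scan(dirty_bits: int, num_lines: int):
    """Baseline model: inspect every cache line in turn.

    Returns (lines written back, lines inspected). The number of
    inspections always equals num_lines, whatever the dirty count.
    """
    written, steps = [], 0
    for line in range(num_lines):
        steps += 1
        if (dirty_bits >> line) & 1:
            written.append(line)
    return written, steps


def writeback_first_one(dirty_bits: int):
    """Assumed model of a first-one search: jump straight to the next dirty line.

    Returns (lines written back, search steps). One step is taken per
    dirty line, independent of how many lines the cache has.
    """
    written, steps = [], 0
    while dirty_bits:
        lowest = dirty_bits & -dirty_bits        # isolate the lowest set bit
        written.append(lowest.bit_length() - 1)  # index of that bit
        dirty_bits ^= lowest                     # mark the line as clean
        steps += 1
    return written, steps


if __name__ == "__main__":
    dirty = 0b10010010  # hypothetical bitmap: lines 1, 4 and 7 are dirty
    print(writeback_line_scan(dirty, 8))
    print(writeback_first_one(dirty))
```

Both functions visit the same dirty lines in the same order; only the number of steps differs, which is the property the abstract attributes to the optimized write-back.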