Authors: WANG Jie (王洁) [1], FU Dan-yang (付丹阳)
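Affiliations: [1] School of Software, Dalian University of Technology, Dalian 116081, Liaoning, China; [2] Beijing Institute of Open Source Chip, Beijing 100085, China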
Source: Computer Engineering & Science (《计算机工程与科学》), 2024, No. 7, pp. 1185-1192 (8 pages)
Abstract: The RISC-V instruction set is flexible and extensible, and the vector extension is one of its extension instruction sets. Implementing the vector extension requires splitting each vector instruction into multiple micro-instructions. If every micro-instruction occupies its own reorder buffer (ROB) entry, the stored information is partly redundant and the number of instructions executing in parallel in the CPU (in-flight instructions) drops, which hurts processor performance. This paper decouples the storage of instructions and micro-instructions in the ROB: a new queue, the RAB, stores per-micro-instruction information such as the rename mapping of each micro-instruction's destination register, while each ROB entry stores only the information common to the micro-instructions split from its corresponding instruction. The ROB and the RAB control the commit and rollback of instructions and micro-instructions, respectively. This reduces redundancy in the stored information and alleviates the drop in in-flight instructions caused by vector instructions splitting into too many micro-instructions. Building on this method, the paper also implements ROB compression for scalar instructions, which raises the maximum number of in-flight instructions without changing the number of ROB entries. Simulation results show that the method effectively improves processor performance.
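The capacity argument in the abstract can be illustrated with a toy occupancy model. This is only a sketch under stated assumptions, not the paper's design: the names `Instr`, `inflight_coupled`, and `inflight_decoupled`, the queue sizes, and the rule that every micro-instruction takes exactly one RAB entry are all hypothetical choices made for illustration. It counts how many instructions from a stream fit in flight when each micro-instruction takes a ROB entry, versus when each instruction takes one ROB entry and its micro-instructions go to a separate queue.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    uops: int  # number of micro-instructions this instruction is split into


def inflight_coupled(instrs, rob_size):
    """Baseline model: every micro-instruction occupies its own ROB entry."""
    used = 0
    count = 0
    for ins in instrs:
        if used + ins.uops > rob_size:
            break  # ROB is full; later instructions must wait
        used += ins.uops
        count += 1
    return count


def inflight_decoupled(instrs, rob_size, rab_size):
    """Decoupled model: one ROB entry per instruction, and (by assumption)
    one RAB entry per micro-instruction for its rename mapping."""
    rob_used = 0
    rab_used = 0
    count = 0
    for ins in instrs:
        if rob_used + 1 > rob_size or rab_used + ins.uops > rab_size:
            break  # either queue being full stalls allocation
        rob_used += 1
        rab_used += ins.uops
        count += 1
    return count


# Hypothetical stream: 4 vector instructions (8 micro-instructions each)
# followed by 8 scalar instructions (1 micro-instruction each).
stream = [Instr("vadd", 8)] * 4 + [Instr("add", 1)] * 8

coupled = inflight_coupled(stream, rob_size=16)
decoupled = inflight_decoupled(stream, rob_size=16, rab_size=64)
print(coupled, decoupled)
```

With these made-up sizes, the coupled model holds 2 instructions in flight and the decoupled model holds all 12. The model ignores commit, rollback, and the scalar ROB compression the abstract also mentions; it only shows why moving per-micro-instruction state out of the ROB frees ROB entries.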
Classification: TP332 [Automation and Computer Technology: Computer Systems Architecture]