Authors: MENG Fanfeng (孟凡丰); WANG Zicong (王子聪); ZHANG Jintao (张金涛); WANG Yanjing (王彦景); OU Yang (欧洋)[1]; WU Lizhou (吴利舟); XIAO Nong (肖侬)[1]
Affiliation: [1] College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China
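Source: Computer Engineering (《计算机工程》), 2025, No. 3, pp. 180-188 (9 pages)
Funding: National Key Research and Development Program of China (2022YFB4500304); National Natural Science Foundation of China (62332021, 62304257).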
Abstract: In the era of big data, data center applications place ever greater demands on large-scale data storage and computation, and the cost of accessing massive data has become the main bottleneck limiting application performance. The Compute Express Link (CXL) interconnect protocol offers a new way to address this problem. This study proposes a hardware and software design for a CXL-based memory pool system. At the hardware level, a CXL extended memory device is built on the gem5 architecture simulator, based on the CXL extended memory protocol. Because the device memory is exposed in the central processing unit (CPU) address space, the CPU can access it directly with load/store instructions. At the operating system level, a driver for the CXL device is written, providing a complete software stack for managing and accessing the device. In user mode, the memkind library is used to integrate host and device memory, giving applications a unified memory view. A complete prototype of the CXL extended memory pool is built in gem5's full-system mode and its performance is evaluated comprehensively. The membench and STREAM benchmarks are used to compare the latency and bandwidth of host-local Dynamic Random Access Memory (DRAM) with those of Host-managed Device Memory (HDM). The experimental results show that HDM latency is approximately 1.5 times that of DRAM and that, under various application scenarios, HDM bandwidth is approximately 50% to 63% of that of DRAM. In addition, the real-world key-value storage engine Viper is run on both DRAM and HDM; in scenarios where DRAM capacity is constrained, using the extended HDM improves performance by a factor of 2 to 7.
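The reported ratios can be turned into rough numbers with a few lines of arithmetic. The sketch below is only an illustration, not the authors' evaluation method: the function names, the DRAM baseline values (100 ns, 20 GB/s), and the linear blending model for mixed DRAM/HDM accesses are all assumptions made for this example. Only the 1.5x latency ratio and the 50% to 63% bandwidth range come from the abstract.

```python
# Ratios taken from the abstract; everything else here is hypothetical.
HDM_LATENCY_RATIO = 1.5                   # HDM latency is about 1.5x DRAM latency
HDM_BANDWIDTH_RATIO_RANGE = (0.50, 0.63)  # HDM bandwidth is about 50%-63% of DRAM


def hdm_estimates(dram_latency_ns, dram_bandwidth_gbs):
    """Scale assumed DRAM figures by the ratios reported in the abstract."""
    low, high = HDM_BANDWIDTH_RATIO_RANGE
    return {
        "hdm_latency_ns": dram_latency_ns * HDM_LATENCY_RATIO,
        "hdm_bandwidth_gbs_low": dram_bandwidth_gbs * low,
        "hdm_bandwidth_gbs_high": dram_bandwidth_gbs * high,
    }


def blended_latency_ns(dram_latency_ns, hdm_fraction):
    """Average latency if `hdm_fraction` of accesses go to HDM.

    Assumes a simple linear mix with no caching or queuing effects,
    which is a simplification chosen for this sketch.
    """
    if not 0.0 <= hdm_fraction <= 1.0:
        raise ValueError("hdm_fraction must be between 0 and 1")
    hdm_latency = dram_latency_ns * HDM_LATENCY_RATIO
    return (1.0 - hdm_fraction) * dram_latency_ns + hdm_fraction * hdm_latency


if __name__ == "__main__":
    # Hypothetical DRAM baseline: 100 ns latency, 20 GB/s bandwidth.
    print(hdm_estimates(100.0, 20.0))
    print(blended_latency_ns(100.0, 0.5))
```

Under this simplified model the latency penalty grows linearly with the share of accesses served from HDM, so it says nothing about the capacity-constrained Viper result, where the alternative to HDM is not DRAM but running out of memory.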
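Keywords: gem5 simulator; Linux driver; Compute Express Link (CXL); memory pool; data center
Classification number: TP302.1 [Automation and Computer Technology: Computer System Architecture]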