Parallel file system network driver based on Tianhe inter-connection system

Authors: DONG Yong[1], WU Huijun, YANG Lihua, ZHANG Wei[1], WANG Ruibo[1], ZHOU Enqiang[1]

Affiliation: [1] College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China

Source: Computer Engineering & Science, 2025, No. 3, pp. 392-399 (8 pages)

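Funding: National Key Research and Development Program of China (2021YFB0300101); HPCL Key Laboratory Project, National University of Defense Technology (202101-03).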

Abstract: The parallel file system is an essential component of the software stack in high-performance computing systems, and a driver designed for the high-speed network is key to providing efficient data access. This paper designs and implements GLND, a parallel file system network driver based on the Tianhe high-speed interconnect network (TH-Express). GLND is optimized in three areas: parallelization, communication protocol, and fault tolerance. It achieves high throughput through VP-level parallelism combined with appropriately balanced pipeline partitioning. It adaptively selects the underlying communication protocol according to factors such as message size, and implements a NUMA-aware memory management mechanism. An adaptively adjusted timeout mechanism keeps abnormal timeouts in the software layer from affecting communication operations. Experimental results show that, under the same hardware conditions, GLND improves write bandwidth by 23.69% on average and read bandwidth by 79.25% on average compared with TCP.
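The abstract names two adaptive mechanisms without giving details: protocol selection by message size and a self-adjusting timeout. The sketch below only illustrates the general shape such mechanisms might take; it is not GLND's implementation. The protocol labels `eager` and `rendezvous`, the 4096-byte cutoff, the doubling and smoothing rules, and all function and class names are assumptions introduced here for illustration.

```python
# Hypothetical cutoff; the abstract does not state the sizes GLND uses.
EAGER_LIMIT_BYTES = 4096


def choose_protocol(message_size, eager_limit=EAGER_LIMIT_BYTES):
    """Pick a transfer path from the message size alone (illustrative only)."""
    if message_size < 0:
        raise ValueError("message_size must be non-negative")
    # Small messages travel inline with the request; large messages
    # exchange a descriptor first and then move the payload in bulk.
    return "eager" if message_size <= eager_limit else "rendezvous"


class AdaptiveTimeout:
    """Timeout that grows after a timeout and relaxes after a success.

    All constants are assumed values, not taken from the paper.
    """

    def __init__(self, initial=1.0, floor=0.5, ceiling=60.0):
        self.value = initial      # current timeout in seconds
        self.floor = floor        # never wait less than this
        self.ceiling = ceiling    # never wait more than this

    def on_timeout(self):
        # Back off: double the timeout, capped at the ceiling.
        self.value = min(self.value * 2.0, self.ceiling)
        return self.value

    def on_success(self, elapsed):
        # Move halfway toward four times the observed completion time,
        # bounded below by the floor and above by the ceiling.
        target = max(4.0 * elapsed, self.floor)
        self.value = min(0.5 * self.value + 0.5 * target, self.ceiling)
        return self.value
```

A caller would consult `choose_protocol(len(payload))` before each send, and feed each completion or expiry back into one `AdaptiveTimeout` instance per peer so that a slow but healthy peer is not declared failed prematurely.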

Keywords: parallel file system; interconnection network; network programming interface

Classification code: TP302 [Automation and Computer Technology: Computer System Architecture]