Implementation and evaluation of UCX communication interface on TH-Express interconnection (基于天河互连的公共通信接口UCX实现与评估)  Cited by: 2


Authors: XIE Min [1,2]; ZHOU Enqiang; DONG Yong [1,2]; ZHANG Wei [1]

Affiliations: [1] College of Computer Science and Engineering, National University of Defense Technology, Changsha Hunan 410073, China; [2] State Key Laboratory of High Performance Computing, Changsha Hunan 410073, China

Source: Journal of Computer Applications (《计算机应用》), 2019, Issue A01, pp. 113-118 (6 pages)

Funding: National Key Research and Development Program of China (2016YFB0200401); National University of Defense Technology Research Program (ZK18-03-10)

Abstract: To support multiple high-performance, scalable parallel programming models on the TH-Express interconnect and on future high-performance interconnection networks, an implementation of the common communication interface UCX (Unified Communication X) based on Remote Direct Memory Access (RDMA) is proposed. The implementation maps UCX data abstractions onto the communication resource objects of the TH-Express interconnect. A high-speed pipelined data transfer protocol for the UCX Active Message and one-sided (remote memory access) interfaces is built on short-message (mini packet) communication and a shared RDMA buffer pool, and a dynamic, scalable credit-based flow control mechanism is proposed to improve the scalability of UCX when running large-scale parallel applications. Experimental results show that, because the UCX interface operations closely match the hardware characteristics of the interconnect and the software stack has fewer processing layers, the total overhead added by the UCX software layer is less than 200 ns. Compared with the existing TH-Express MPI (Message Passing Interface) implementation, MPI built on this UCX reduces communication latency by about 50 ns and improves the short-message rate by about 10%. This UCX implementation provides a good technical foundation for extending the range of parallel programming models and application types on the TH-Express interconnection network while maintaining parallel efficiency.
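The abstract names a dynamic credit-based flow control mechanism but does not give its details. As a rough illustration of the general technique only, the sketch below shows a sender that may transmit only while it holds credits, and a receiver that hands out credits from a shared buffer pool and grants more to a peer that reports a backlog. All class names, parameters, and the backlog-reporting field are hypothetical and are not taken from the paper or from the UCX API.

```python
from collections import deque


class CreditSender:
    """Sender side: one credit stands for one free receive buffer at the peer."""

    def __init__(self, initial_credits):
        self.credits = initial_credits
        self.pending = deque()  # messages waiting for a credit
        self.sent = []          # stands in for the network (illustration only)

    def send(self, msg):
        self.pending.append(msg)
        self._drain()

    def grant(self, n):
        """Called when the receiver returns n credits."""
        self.credits += n
        self._drain()

    def _drain(self):
        # Transmit queued messages while credits remain.
        while self.credits > 0 and self.pending:
            self.sent.append(self.pending.popleft())
            self.credits -= 1


class CreditReceiver:
    """Receiver side: shares a fixed buffer pool among peers (hypothetical policy)."""

    def __init__(self, pool_size, initial_grant):
        self.free = pool_size
        self.initial_grant = initial_grant
        self.granted = {}  # peer -> buffers currently reserved for that peer

    def connect(self, peer):
        n = min(self.initial_grant, self.free)
        self.free -= n
        self.granted[peer] = n
        return n

    def consume(self, peer, backlog):
        """Process one message; return the number of credits handed back.

        backlog is the queue length the peer reports (an assumed piggyback field).
        """
        # The buffer used by the processed message goes back to the pool.
        self.granted[peer] -= 1
        self.free += 1
        # Dynamic part: return one credit plus extra for a backlogged peer,
        # limited by what the shared pool can still provide.
        n = min(1 + backlog, self.free)
        self.free -= n
        self.granted[peer] += n
        return n
```

In this sketch a lightly loaded peer holds only a small initial grant, so the pool can serve many peers, while a busy peer temporarily receives a larger share. Whether the paper's mechanism follows this policy is not stated in the abstract.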

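Keywords: high-speed interconnection network; parallel programming model; Message Passing Interface (MPI); common communication interface; Remote Direct Memory Access (RDMA)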

Classification: TP316.4 [Automation and Computer Technology: Computer Software and Theory]