Authors: Mingtian SHAO, Kai LU, Wanqing CHI, Ruibo WANG, Yiqin DAI, Wenzhe ZHANG
Affiliation: [1] College of Computer, National University of Defense Technology, Changsha 410073, China
Source: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿(英文版)), 2022, Issue 11, pp. 1631-1645 (15 pages)
Funding: Project supported by the National Natural Science Foundation of China (No. 61902405); the Tianhe Supercomputer Project of China (No. 2018YFB0204301); the PDL Research Fund of China (No. 6142110190404); and the National High-Level Personnel for Defense Technology Program, China (No. 2017-JCJQ-ZQ-013).
Abstract: High-performance computing (HPC) systems are about to reach a new height: exascale. Application deployment is becoming an increasingly prominent problem. Container technology solves the problems of encapsulating and migrating applications and their execution environments. However, container images are large, and deploying an image to a large number of compute nodes is time-consuming. Although the peer-to-peer (P2P) approach brings higher transmission efficiency, it introduces a larger network load. All of these issues lead to high application startup latency. To solve these problems, we propose the topology-aware execution environment service (TEES) for fast and agile application deployment on HPC systems. TEES creates a more lightweight execution environment for users and uses a more efficient topology-aware P2P approach to reduce deployment time. Combined with a split-step transport and launch-in-advance mechanism, TEES reduces application startup latency. On the Tianhe HPC system, TEES deploys and starts a typical application on 17560 compute nodes within 3 s. Compared to container-based application deployment, the speed is increased 12-fold, and the network load is reduced by 85%.
Keywords: Execution environment; Application deployment; High-performance computing (HPC); Container; Peer-to-peer (P2P); Network topology
Classification number: TP315 [Automation and Computer Technology: Computer Software and Theory]