Authors: 冯伟 (FENG Wei) [1], 姜远飞 (JIANG Yuanfei) [1]
Affiliation: [1] Institute of Atomic and Molecular Physics, Jilin University, Changchun 130012, China
Source: 《吉林大学学报(信息科学版)》 (Journal of Jilin University (Information Science Edition)), 2023, No. 2, pp. 381-386 (6 pages)
Funding: Jilin University 2019 Experimental Technology Fund project (11974136).
Abstract: In high-performance cluster monitoring and management, anomaly detection is constrained by time and place, so cluster administrators cannot discover cluster anomalies promptly, which affects the normal operation of the cluster system. To address these problems, the open functions and message delivery mechanism of Enterprise WeChat (企业微信) are combined with cluster monitoring and management methods for the Linux (GNU/Linux) operating system to develop a cluster monitoring and management system that is simple, easy to use, easy to extend, and suited to small and medium-sized clusters; it presents alert information on mobile phones. The paper describes the system requirements, the system framework and functional design, the technical framework and data flow, and the specific process of system deployment and development. The system has been fully developed and is used in day-to-day cluster management at the Institute of Atomic and Molecular Physics, Jilin University, where it has achieved good application results. Cluster administrators and users can monitor the hardware and software performance of the cluster and the completion status of jobs through a mobile app without logging in to cluster nodes, which makes timely follow-up easier. In particular, during the COVID-19 period, when staff worked from home and cluster access was inconvenient, this function supported the efficient progress of scientific research at the institute.
Keywords: Enterprise WeChat (企业微信); high-performance computing cluster; performance monitoring; job management; message delivery
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
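
The abstract describes pushing cluster alerts to mobile phones through the Enterprise WeChat message mechanism but gives no implementation details. The following is a minimal sketch of that general idea, not the authors' system. It assumes delivery through a group-robot webhook URL supplied by the caller, a text-type message payload, and a load-per-CPU threshold; the names `build_alert`, `send_alert`, `check_local_node`, and `LOAD_PER_CPU_LIMIT` are hypothetical.

```python
import json
import os
import urllib.request

# Hypothetical threshold: alert when the 1-minute load average per CPU exceeds this.
LOAD_PER_CPU_LIMIT = 1.5


def build_alert(hostname, load1, cpu_count, limit=LOAD_PER_CPU_LIMIT):
    """Return a text-message payload if the node is overloaded, else None."""
    per_cpu = load1 / cpu_count
    if per_cpu <= limit:
        return None
    content = (
        f"[cluster alert] {hostname}: load {load1:.2f} "
        f"on {cpu_count} CPUs ({per_cpu:.2f} per CPU)"
    )
    return {"msgtype": "text", "text": {"content": content}}


def send_alert(payload, webhook_url):
    """POST the payload as JSON to the webhook URL (assumed delivery channel)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def check_local_node(webhook_url):
    """Read the local load average (Unix only) and send an alert if needed."""
    load1, _, _ = os.getloadavg()
    payload = build_alert(os.uname().nodename, load1, os.cpu_count() or 1)
    if payload is not None:
        return send_alert(payload, webhook_url)
    return None
```

In this sketch the threshold check is kept separate from the network call, so `build_alert` can be exercised without a webhook; a scheduler such as cron could call `check_local_node` periodically on each node.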