A Data Processing Acceleration Method and System for FAST Petabyte Pulsar Data Processing  Cited by: 5

Authors: Zhang Hui [1,3,4]; Xie Xiaoyao; Li Di [2,5]; Liu Zhijie; Wang Pei; Yu Xuhong [1,4]; You Shanping; Xu Yuyun [6]; Jiang Jiatao [1,3,4]

Affiliations: [1] Guizhou Key Laboratory of Information and Computing Science, Guizhou Normal University, Guiyang 550001, China; [2] National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; [3] School of Mathematics and Sciences, Guizhou Normal University, Guiyang 550001, China; [4] FAST Early Science Data Center, Guiyang 550001, China; [5] University of Chinese Academy of Sciences, Beijing 100049, China; [6] Institute of Management Engineering, Guizhou Vocational and Technical College of Water Resources and Hydropower, Guiyang 551416, China

Source: Astronomical Research & Technology (《天文研究与技术》), 2021, No. 1, pp. 129-137 (9 pages)

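Funding: National Natural Science Foundation of China (U1831131, U1631132, U1731238, 11743002); FAST Major Achievement Cultivation Project of the Center for Astronomical Mega-Science, Chinese Academy of Sciences (FAST[2019sr04]); National Key R&D Program of China (2017YFA0402600); Strategic Priority Research Program (Category B) of the Chinese Academy of Sciences (XDB23000000); 2017 Joint Fund of the Guizhou Provincial Department of Science and Technology (Qiankehe LH [2017] No. 7338); Guizhou Normal University Graduate Innovation Fund (Yanchuang 201528).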

Abstract: The Five-hundred-meter Aperture Spherical Telescope (FAST) has entered scientific operation. Its drift-scan pulsar survey has already collected several PB of data, and at least 5 PB is expected to be added each year. Existing data processing software such as PRESTO and SIGPROC cannot process PB-scale data quickly enough. This paper proposes a PRESTO-based distributed parallel computing method that integrates database technology with geographically distributed, heterogeneous computing resources to build a computing acceleration system named Craber, jointly designed and implemented by the FAST Early Science Data Center and the National Astronomical Observatories. Using 55 computing nodes in Craber's sub-network computing cluster D, the system workflow and the search database were validated with the Parkes multibeam survey data set (Australia) and FAST drift-scan data. A single 100 MB Parkes survey data file took 36 s on average, and a single 128 MB FAST survey data file took 22 s on average. The system is already in use for actual data processing and has discovered dozens of pulsars, helping FAST accelerate its data processing and enlarge its sample of new pulsars.

English abstract: The Five-hundred-meter Aperture Spherical radio Telescope (FAST) has started normal science operation. Data collected by the drift-scan pulsar survey have exceeded 1 PB, and the volume is expected to increase by at least 5 PB per year. Existing pulsar search software, such as PRESTO and SIGPROC, cannot meet the real-time data analysis and management requirements. Efficiently processing PB-scale data has become a new challenge in radio astronomy. To tackle the PB-scale data analysis and data management problems encountered by FAST, a joint team from Guizhou Normal University (GZNU) and the National Astronomical Observatories (NAOC) designed and implemented a PRESTO-based distributed parallel computing system, named Craber, which integrates network technology, a database, and cross-regional hardware computing resources. Craber performed well on data sets from both the Parkes Multibeam Pulsar Survey (PMPS) and the Commensal Radio Astronomy FAST Survey (CRAFTS). Using the 55 computing nodes in sub-cluster D of Craber, a 100 MB Parkes data file took about 36 s to process, while a 128 MB data file from CRAFTS took about 22 s. To date, Craber has processed more than 66000 data files from FAST and helped FAST detect more than 140 high-quality candidates, 114 of which have been confirmed. All resulting data products are stored in the integrated Oracle database or on a dedicated file server, ready for further candidate selection with AI. Craber has already helped FAST speed up its data processing substantially and discover a number of new pulsars.
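The per-file timings quoted in the abstract can be turned into rough throughput figures. The sketch below is only back-of-the-envelope arithmetic, not part of Craber: the function names are hypothetical, 1 PB is assumed to be 10^9 MB (decimal units), and files are assumed to be processed one after another within each stream. The abstract does not say how many files the cluster handles concurrently, so `parallel_streams` is left as an explicit assumption.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY
MB_PER_PB = 1e9  # assumption: decimal units (1 PB = 10^9 MB)


def throughput_mb_per_s(file_size_mb: float, seconds_per_file: float) -> float:
    """Average data rate implied by one file size and its processing time."""
    return file_size_mb / seconds_per_file


def files_per_day(seconds_per_file: float) -> float:
    """Files one stream can finish per day if files run back to back."""
    return SECONDS_PER_DAY / seconds_per_file


def years_to_process(volume_pb: float, file_size_mb: float,
                     seconds_per_file: float, parallel_streams: int = 1) -> float:
    """Years needed for a data volume, split into equal files, given a
    hypothetical number of independent streams working in parallel."""
    n_files = volume_pb * MB_PER_PB / file_size_mb
    total_seconds = n_files * seconds_per_file / parallel_streams
    return total_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Figures from the abstract: 100 MB in ~36 s (Parkes), 128 MB in ~22 s (FAST).
    print(f"Parkes: {throughput_mb_per_s(100, 36):.2f} MB/s")
    print(f"FAST:   {throughput_mb_per_s(128, 22):.2f} MB/s")
    print(f"FAST files per day, one stream: {files_per_day(22):.0f}")
    print(f"Years for 5 PB, one stream: {years_to_process(5, 128, 22):.1f}")
```

Varying `parallel_streams` shows how the estimate scales if several such streams run side by side; the actual degree of concurrency in Craber is not stated in this record.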

Keywords: pulsar; acceleration system; search database; data processing; Five-hundred-meter Aperture Spherical radio Telescope (FAST)

Classification number: P162 [Astronomy and Earth Sciences: Astronomy]

 
