Implementation and Optimization of High-precision Dot Product Algorithm Based on SW1621 Processor

Authors: XU Fang-Jie, WANG Lei[1], WANG Yi-Zhuo, ZHANG Ya-Guang (Research Institute of Frontier Information Technology, Zhongyuan University of Technology, Zhengzhou 450007, China)

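Affiliation: [1] Research Institute of Frontier Information Technology, Zhongyuan University of Technology, Zhengzhou 450007, China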

Source: Computer Systems & Applications, 2023, No. 2, pp. 400-405 (6 pages)

Abstract: The dot product is a Level 1 routine in the BLAS library and is widely called in scientific computing and related fields. Because floating-point arithmetic introduces rounding errors, the double-precision dot product in existing BLAS libraries cannot meet the accuracy requirements of some applications, so high-precision algorithms are needed for more accurate and reliable computation. In this study, targeting the domestic SW1621 platform, an interface for a high-precision dot product function is added on top of the existing BLAS library to meet these requirements. The high-precision dot product algorithm is also hand-optimized at the assembly level using loop unrolling, memory access optimization, and instruction reordering. Experimental results show that the precision of the high-precision dot product is approximately twice that of the double-precision dot product, which effectively improves on the precision of the original algorithm. With this precision gain preserved, the optimized high-precision dot product function achieves an average speedup of 1.61 over the unoptimized version.
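The abstract does not say which high-precision algorithm the paper uses. As an illustration only, the sketch below shows one well-known way to obtain roughly twice the working precision from ordinary double-precision operations: a compensated dot product built from error-free transformations (Knuth's TwoSum and Dekker's TwoProduct with Veltkamp splitting). The function names `two_sum`, `split`, `two_prod`, `dot2`, and `naive_dot` are hypothetical and are not taken from the paper or from any BLAS interface. The code is plain Python and does not reflect the assembly-level SW1621 optimizations described above.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def split(a):
    """Veltkamp split: return (hi, lo) with a == hi + lo, each half short enough
    that products of halves are exact in double precision."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_prod(a, b):
    """Dekker's TwoProduct: return (p, e) with p = fl(a * b) and a * b == p + e
    exactly, barring overflow and underflow."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = a_lo * b_lo - (((p - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return p, e


def dot2(x, y):
    """Compensated dot product (assumed illustration, not the paper's routine).

    The rounding error of every product and every partial sum is captured
    and accumulated separately in s, then added back at the end.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if not x:
        return 0.0
    p, s = two_prod(x[0], y[0])
    for xi, yi in zip(x[1:], y[1:]):
        h, r = two_prod(xi, yi)   # h + r is the exact product
        p, q = two_sum(p, h)      # p + q is the exact partial sum
        s += q + r                # accumulate the captured errors
    return p + s


def naive_dot(x, y):
    """Plain double-precision dot product, for comparison."""
    acc = 0.0
    for xi, yi in zip(x, y):
        acc += xi * yi
    return acc


if __name__ == "__main__":
    x = [1e16, 1.0, -1e16]
    y = [1.0, 1.0, 1.0]
    print(naive_dot(x, y))  # 0.0: the 1.0 is lost when added to 1e16
    print(dot2(x, y))       # 1.0: the lost term is recovered from the error sum
```

In this small case the exact result is 1.0. The plain loop returns 0.0 because of cancellation, while the compensated version recovers the correct value at the cost of several extra floating-point operations per element. That extra work in the inner loop is the kind of overhead that techniques such as loop unrolling and instruction reordering aim to reduce.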

Keywords: SW1621; dot product; high precision; BLAS library interface; performance optimization

Classification code: TP332 [Automation and Computer Technology: Computer Architecture]

 
