Implementation and Optimization of Extended Function Library Based on SW26010 Processor  Cited by: 10


Authors: CAO Dai (曹代), GUO Shaozhong (郭绍忠), ZHANG Xin (张辛)

Affiliation: [1] State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002

Source: Computer Engineering (《计算机工程》), 2017, No. 1, pp. 61-66, 71 (7 pages)

Funding: National "863" Program project (2009AA012201)

Abstract: Intel, AMD and IBM each provide vector extension libraries tailored to their own hardware. Compared with traditional scalar computation, vectorization yields a relatively high speedup. This paper therefore develops a vector math library for the SW26010 processor. Building on an analysis of the series and iterative methods commonly used to evaluate functions, it studies an efficient vectorized algorithm covering trigonometric, inverse trigonometric, exponential and logarithmic functions, then implements and optimizes it so that the functions support high-precision, high-performance computation and meet floating-point arithmetic requirements. Test results show that the algorithm's precision satisfies the requirements of specific applications on the SW26010 processor and that, compared with the Intel VML math library, the average speedup of every function exceeds 1.1.
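Illustration: the abstract names two algorithm families, the series method and the iterative method, without giving details. The sketch below shows one textbook instance of each in scalar Python. It is not the authors' implementation and does not use SW26010 vector instructions. The function names `exp_series` and `sqrt_newton`, the degree-13 polynomial and the iteration count are all assumptions chosen for illustration. The range-reduction step is only loosely related to the "data segmentation" keyword, whose exact meaning in the paper is not stated here.

```python
import math

LN2 = math.log(2.0)

# Taylor coefficients 1/n! for n = 13..0, highest degree first (Horner order).
# Degree 13 is an illustrative choice, not a value taken from the paper.
_COEFFS = [1.0 / math.factorial(n) for n in range(13, -1, -1)]


def exp_series(x: float) -> float:
    """Series method: exp(x) by range reduction plus a truncated Taylor series."""
    # Range reduction: write x = k*ln2 + r with |r| <= ln2/2,
    # so that exp(x) = 2**k * exp(r) and the series only sees a small r.
    k = round(x / LN2)
    r = x - k * LN2
    # Evaluate the polynomial in r with Horner's rule.
    acc = 0.0
    for c in _COEFFS:
        acc = acc * r + c
    # Scale by 2**k.
    return math.ldexp(acc, k)


def sqrt_newton(a: float, iterations: int = 6) -> float:
    """Iterative method: sqrt(a) for a >= 0 by Newton's iteration."""
    if a == 0.0:
        return 0.0
    # Split a = m * 2**e with an even exponent e and m in [0.5, 2).
    m, e = math.frexp(a)
    if e % 2:
        m *= 2.0
        e -= 1
    # Newton's iteration y <- (y + m/y) / 2 converges quadratically to sqrt(m).
    y = 0.5 * (1.0 + m)
    for _ in range(iterations):
        y = 0.5 * (y + m / y)
    return math.ldexp(y, e // 2)


if __name__ == "__main__":
    for x in (-5.0, 0.5, 10.0):
        print(x, exp_series(x), math.exp(x))
    for a in (0.25, 2.0, 1.0e10):
        print(a, sqrt_newton(a), math.sqrt(a))
```

A vector library would apply the same steps to several lanes at once and tune the polynomial degree and iteration count per function. Those choices are outside what this record states.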

Keywords: floating-point arithmetic; mathematical functions; SW26010 processor; data segmentation; instruction scheduling

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
