Authors: SUN Cheng-guo (孙成国), LAN Jing (兰静) [2], JIANG Hao (姜浩) [1]
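Affiliations: [1] College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; [2] Rongzhi College, Chongqing Technology and Business University, Chongqing 404100, China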
Source: Computer Engineering & Science (《计算机工程与科学》), 2018, No. 5, pp. 798-804 (7 pages)
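Funding: National 863 Program (2012AA01A301); National Natural Science Foundation of China (61402495; 61303189; 61602166; 61170049; 61402496); Key Project of the Chongqing Education Science Planning Program (2015-GX-036)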
Abstract: Based on OpenBLAS and BLIS, two open-source basic linear algebra libraries, this paper studies the performance optimization of dense matrix multiplication (GEMM). To address the question of how to select the key blocking parameters of the blocked parallel GEMM algorithm, a performance optimization model is established. An improved genetic algorithm is used to solve this model: each combination of blocking parameters is an individual in the population, and the GEMM performance measured with that combination is the individual's fitness. Through repeated selection, crossover, and mutation, the algorithm searches for the combination of blocking parameters that gives the best GEMM performance. Numerical experiments show that GEMM performance with the blocking parameters found by the genetic algorithm is better than with the default blocking parameters, so the optimization goal is achieved.
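The search loop described in the abstract can be sketched as follows. This is a plain genetic algorithm, not the paper's improved variant, and every name and number in it is an assumption made for illustration: the parameter names `mc`, `kc`, `nc`, their candidate ranges, the default tuple, and the `synthetic_gflops` fitness function, which stands in for timing a real GEMM build and does not represent measured data.

```python
import random

# Hypothetical search space: candidate values for three GEMM block sizes.
SPACE = {
    "mc": list(range(32, 513, 32)),
    "kc": list(range(32, 513, 32)),
    "nc": list(range(256, 4097, 256)),
}
KEYS = tuple(SPACE)


def synthetic_gflops(ind):
    """Stand-in fitness; a real run would time GEMM built with these block sizes."""
    mc, kc, nc = ind
    # Arbitrary smooth surface peaking at (192, 384, 2048); not measured data.
    return (100.0 - ((mc - 192) / 32) ** 2
            - ((kc - 384) / 32) ** 2 - ((nc - 2048) / 256) ** 2)


def tournament(pop, scores, rng, k=3):
    """Selection: return the fittest of k randomly drawn individuals."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(ind, rng, rate=0.2):
    """Mutation: resample each gene from its range with probability `rate`."""
    return tuple(
        rng.choice(SPACE[key]) if rng.random() < rate else gene
        for key, gene in zip(KEYS, ind)
    )


def optimize(fitness, default, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    # Seed the population with the default so the result is never worse than it.
    pop = [default] + [
        tuple(rng.choice(SPACE[key]) for key in KEYS) for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        elite = pop[max(range(len(pop)), key=lambda i: scores[i])]
        children = [elite]  # elitism: carry the best individual forward unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    return max(pop, key=fitness)
```

Calling `optimize(synthetic_gflops, (128, 256, 4096))` returns a parameter tuple whose fitness is at least that of the default tuple, because the default is placed in the initial population and the best individual is always carried into the next generation. In a real tuning run the fitness call would rebuild or reconfigure the library and benchmark GEMM, which is far more expensive than this stand-in.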
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]