Adaptive computing optimization of sparse matrix-vector multiplication based on heterogeneous platforms


Authors: LI Bo; HUANG Jianqiang [1,2]; HUANG Dongqiang; WANG Xiaoying

Affiliations: [1] Department of Computer Technology and Applications, Qinghai University, Xining, Qinghai 810016, China; [2] Qinghai Provincial Laboratory of Intelligent Computing and Applications (Qinghai University), Xining, Qinghai 810016, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 12, pp. 3867-3875 (9 pages)

Funding: Qinghai Province Applied Basic Research Program (2022-ZJ-701); National Natural Science Foundation of China (62062059).

Abstract: Sparse matrix-vector multiplication (SpMV) is an important numerical linear algebra operation. Existing SpMV optimizations do not fully account for preprocessing and communication time, and their storage structures lack generality. To address these issues, an adaptive optimization scheme for SpMV on heterogeneous platforms is proposed. The scheme uses Pearson correlation coefficients to select highly correlated feature parameters, and trains prediction models with two Gradient Boosting Decision Tree (GBDT) based algorithms, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), to determine the better storage format for a given sparse matrix. Grid search is used to find better model hyperparameters for training, so that both algorithms select the more suitable storage structure with an accuracy above 85%. Furthermore, for sparse matrices whose predicted storage format is hybrid (HYB), the ELLPACK (ELL) part and the coordinate (COO) part are computed on the GPU and the CPU respectively, establishing a CPU+GPU parallel hybrid computing mode. In addition, a hardware platform is selected for sparse matrices with small data sizes to improve computational speed. Experimental results show that the adaptive computing optimization achieves an average speedup of 1.4 over computation with the Compressed Sparse Row (CSR) storage format in the cuSPARSE library, and average speedups of 2.1 and 2.6 over computation with the HYB and ELL storage formats, respectively.
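To make the HYB split mentioned in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It splits a matrix given as (row, column, value) triplets into an ELL part holding at most `ell_width` entries per row and a COO part holding the overflow, multiplies each part by the vector separately, and sums the partial results. The function names and the `ell_width` parameter are hypothetical, and both parts run sequentially on the CPU here; in the scheme described above the ELL part would run on the GPU and the COO part on the CPU.

```python
from typing import List, Tuple

Triplet = Tuple[int, int, float]  # (row, col, value); hypothetical input layout


def split_hyb(n_rows: int, entries: List[Triplet], ell_width: int):
    """Split triplets into an ELL part (<= ell_width per row) and a COO overflow."""
    ell_cols = [[-1] * ell_width for _ in range(n_rows)]  # -1 marks padding
    ell_vals = [[0.0] * ell_width for _ in range(n_rows)]
    fill = [0] * n_rows  # number of ELL slots already used in each row
    coo: List[Triplet] = []
    for r, c, v in entries:
        k = fill[r]
        if k < ell_width:
            ell_cols[r][k] = c
            ell_vals[r][k] = v
            fill[r] = k + 1
        else:
            coo.append((r, c, v))  # row is full: entry overflows to COO
    return ell_cols, ell_vals, coo


def spmv_ell(ell_cols, ell_vals, x: List[float]) -> List[float]:
    """y = A_ell * x; regular row-wise layout, the part suited to the GPU."""
    y = [0.0] * len(ell_cols)
    for r, (cols, vals) in enumerate(zip(ell_cols, ell_vals)):
        for c, v in zip(cols, vals):
            if c >= 0:  # skip padding slots
                y[r] += v * x[c]
    return y


def spmv_coo(n_rows: int, coo: List[Triplet], x: List[float]) -> List[float]:
    """y = A_coo * x; irregular overflow entries, the part kept on the CPU."""
    y = [0.0] * n_rows
    for r, c, v in coo:
        y[r] += v * x[c]
    return y


def spmv_hyb(n_rows: int, entries: List[Triplet], x: List[float],
             ell_width: int) -> List[float]:
    """Compute y = A * x as the sum of the ELL and COO partial products."""
    ell_cols, ell_vals, coo = split_hyb(n_rows, entries, ell_width)
    y_ell = spmv_ell(ell_cols, ell_vals, x)
    y_coo = spmv_coo(n_rows, coo, x)
    return [a + b for a, b in zip(y_ell, y_coo)]
```

Because the two partial products touch disjoint sets of matrix entries, they are independent and can be computed concurrently on different devices before the final element-wise sum, which is the property a CPU+GPU hybrid mode relies on.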

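Keywords: sparse matrix-vector multiplication; adaptive optimization; Pearson correlation coefficient; extreme gradient boosting; light gradient boosting; machine learning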

CLC number: TP311.1 [Automation and Computer Technology: Computer Software and Theory]

 
