Authors: LI Bo (李博); HUANG Jianqiang (黄建强)[1,2]; HUANG Dongqiang (黄东强); WANG Xiaoying (王晓英)
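Affiliations: [1] Department of Computer Technology and Applications, Qinghai University, Xining, Qinghai 810016, China; [2] Qinghai Provincial Laboratory of Intelligent Computing and Applications (Qinghai University), Xining, Qinghai 810016, China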
Source: Journal of Computer Applications (《计算机应用》), 2024, No. 12, pp. 3867-3875 (9 pages)
Funding: Qinghai Province Applied Basic Research Program (2022-ZJ-701); National Natural Science Foundation of China (62062059).
Abstract: Sparse Matrix-Vector multiplication (SpMV) is an important numerical linear algebra operation. Existing optimizations for SpMV do not fully account for preprocessing and communication time, and their storage structures lack generality. To address these issues, an adaptive optimization scheme for SpMV on heterogeneous platforms is proposed. The scheme uses Pearson correlation coefficients to identify highly correlated feature parameters, and trains prediction models with two Gradient Boosting Decision Tree (GBDT) based algorithms, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), to determine the better storage format for a given sparse matrix. Grid search is used to find better model hyperparameters for training, so that both algorithms select the more suitable storage structure with an accuracy above 85%. Furthermore, for sparse matrices whose predicted storage format is HYBrid (HYB), the ELLPACK (ELL) and COOrdinate (COO) parts are computed on the GPU and the CPU respectively, establishing a CPU+GPU parallel hybrid computing mode. In addition, a hardware platform is selected for sparse matrices with small data sizes to improve computation speed. Experimental results show that the adaptive computing optimization achieves an average speedup of 1.4 over the Compressed Sparse Row (CSR) storage format in the cuSPARSE library, and average speedups of 2.1 and 2.6 over the HYB and ELL storage formats, respectively.
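Keywords: sparse matrix-vector multiplication; adaptive optimization; Pearson correlation coefficient; eXtreme Gradient Boosting (XGBoost); Light Gradient Boosting Machine (LightGBM); machine learning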
Classification number: TP311.1 [Automation and Computer Technology: Computer Software and Theory]
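The HYB split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the dense input, and the per-row threshold `k` are assumptions made for illustration, and both parts run on the CPU here, whereas the paper computes the ELL part on the GPU and the COO part on the CPU.

```python
import numpy as np

def split_hyb(dense, k):
    """Split a dense matrix into an ELL part (first k nonzeros of each row)
    and a COO part (the remaining nonzeros). k is a hypothetical threshold."""
    n_rows = dense.shape[0]
    ell_cols = np.zeros((n_rows, k), dtype=np.int64)    # padded slots point at column 0
    ell_vals = np.zeros((n_rows, k), dtype=dense.dtype)  # padded slots hold value 0
    coo_rows, coo_cols, coo_vals = [], [], []
    for i in range(n_rows):
        cols = np.nonzero(dense[i])[0]
        head, tail = cols[:k], cols[k:]
        ell_cols[i, :len(head)] = head
        ell_vals[i, :len(head)] = dense[i, head]
        for j in tail:  # overflow entries go to the COO part
            coo_rows.append(i)
            coo_cols.append(j)
            coo_vals.append(dense[i, j])
    coo = (np.array(coo_rows, dtype=np.int64),
           np.array(coo_cols, dtype=np.int64),
           np.array(coo_vals, dtype=dense.dtype))
    return (ell_cols, ell_vals), coo

def spmv_ell(ell_cols, ell_vals, x):
    # Regular, fixed-width rows: the part the paper assigns to the GPU.
    return (ell_vals * x[ell_cols]).sum(axis=1)

def spmv_coo(rows, cols, vals, x, n_rows):
    # Irregular leftovers: the part the paper assigns to the CPU.
    y = np.zeros(n_rows, dtype=np.result_type(vals, x))
    np.add.at(y, rows, vals * x[cols])  # accumulate repeated row indices
    return y

def spmv_hyb(dense, x, k):
    (ell_cols, ell_vals), (rows, cols, vals) = split_hyb(dense, k)
    # The two partial products are independent, so they could run concurrently.
    return spmv_ell(ell_cols, ell_vals, x) + spmv_coo(rows, cols, vals, x, dense.shape[0])
```

Because the ELL and COO partial products do not depend on each other, they can be computed on different devices and summed afterwards, which is the property a CPU+GPU hybrid mode relies on.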