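Authors: CUI Liqun (崔丽群) [1]; HU Lei (胡磊)
Affiliation: [1] College of Software, Liaoning Technical University, Huludao 125105, Liaoning, China
Source: Software Guide (《软件导刊》), 2024, No. 6, pp. 44-52 (9 pages)
Funding: Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKMZ20220699).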
Abstract: In recent years, convolutional neural networks, a core technology of deep learning, have achieved strong results in image classification, object detection, speech recognition, and other fields. Over the same period, deep neural networks built from many convolutional layers have emerged and significantly improved accuracy across a range of tasks. However, network weights are usually restricted to a single precision type, which leaves the model large relative to the memory available on specific hardware platforms, and uniform precision types such as FP16 (16-bit floating point) and INT8 can no longer meet the practical requirements of some model inference workloads. To address this, the paper proposes a mixed-precision inference algorithm that takes the subgraph as its minimum unit, determines the fusion relationship between adjacent nodes, and enlarges the set of candidate bit widths. First, an FP16 half-precision configuration is added to the search space of the original single-precision quantization design, which enlarges the final search space and gives more opportunity to find the optimal solution. Second, following the idea of subgraph fusion, integer linear programming assigns precision configurations to the fused subgraphs, and the computational graph is partitioned under three constraints: model size, inference latency, and bit-width operation count. This reduces the accumulated perturbation error. Finally, experiments on the ResNet family show that the accuracy loss of the proposed model relative to HAWQ V3 does not exceed 1%, while inference is faster than with other mixed-precision quantization methods: inference speed improves by 18.15% and 19.21%, respectively, on ResNet18, and by 13.15% and 13.70%, respectively, on ResNet50.
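The constrained precision assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the integer linear programming solver with exhaustive enumeration, which is only practical for a handful of subgraphs. The candidate set (`INT4`, `INT8`, `FP16`), the three example subgraphs, all cost numbers, and the names `total_cost` and `assign_precisions` are hypothetical. The bit-width operation count is assumed to be MACs multiplied by the squared bit width, meaning weights and activations share one precision per subgraph.

```python
from itertools import product

# Candidate precisions and their bit widths (assumed candidate set).
BITS = {"INT4": 4, "INT8": 8, "FP16": 16}

# Hypothetical fused subgraphs. Every number here is an illustrative
# placeholder, not a measurement from the paper.
SUBGRAPHS = [
    {"params": 1000, "macs": 10000,
     "latency": {"INT4": 1.0, "INT8": 1.5, "FP16": 2.5},
     "perturbation": {"INT4": 0.9, "INT8": 0.3, "FP16": 0.05}},
    {"params": 4000, "macs": 20000,
     "latency": {"INT4": 2.0, "INT8": 3.0, "FP16": 5.0},
     "perturbation": {"INT4": 0.4, "INT8": 0.1, "FP16": 0.02}},
    {"params": 2000, "macs": 5000,
     "latency": {"INT4": 0.5, "INT8": 0.8, "FP16": 1.2},
     "perturbation": {"INT4": 1.2, "INT8": 0.5, "FP16": 0.1}},
]


def total_cost(subgraphs, assignment):
    """Return (size in bits, latency, bit operations, perturbation) for an assignment."""
    pairs = list(zip(subgraphs, assignment))
    size = sum(g["params"] * BITS[p] for g, p in pairs)
    latency = sum(g["latency"][p] for g, p in pairs)
    # Assumption: weights and activations use the same bit width in a subgraph.
    bitops = sum(g["macs"] * BITS[p] ** 2 for g, p in pairs)
    perturbation = sum(g["perturbation"][p] for g, p in pairs)
    return size, latency, bitops, perturbation


def assign_precisions(subgraphs, max_size, max_latency, max_bitops):
    """Pick one precision per subgraph, minimizing total perturbation
    subject to the three budgets. Returns None if nothing is feasible."""
    best, best_err = None, float("inf")
    for assignment in product(BITS, repeat=len(subgraphs)):
        size, latency, bitops, err = total_cost(subgraphs, assignment)
        if size > max_size or latency > max_latency or bitops > max_bitops:
            continue  # violates at least one constraint
        if err < best_err:
            best, best_err = assignment, err
    return best


if __name__ == "__main__":
    choice = assign_precisions(SUBGRAPHS, max_size=60000,
                               max_latency=6.0, max_bitops=float("inf"))
    print(choice, total_cost(SUBGRAPHS, choice))
```

Enumeration grows exponentially with the number of subgraphs. For a full network the same objective and constraints would be handed to an integer linear programming solver, as the abstract describes.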
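Keywords: subgraph fusion; mixed-precision inference; constrained optimization solving; GPU acceleration
Classification code: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]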