检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:李冰[1] 午康俊 王晶 李森[4] 高岚 张伟功[2] 倪天明 LI Bing;WU Kangjun;WANG Jing;LI Sen;GAO Lan;ZHANG Weigong;NI Tianming(College of Information Engineering,Capital Normal University,Beijing 100048,China;Academy for Multidisciplinary Studies,Capital Normal University,Beijing 100048,China;School of Information,Renmin University of China,Beijing 100048,China;China Academy of Launch Vehicle Technology,Beijing 100076,China;College of Electrical Engineering,Anhui Polytechnic University,Wuhu 241000,China)
机构地区:[1]首都师范大学交叉科学研究院,北京100048 [2]首都师范大学信息工程院,北京100048 [3]人民大学信息学院,北京100048 [4]中国运载火箭技术研究院,北京100076 [5]安徽工程大学集成电路与系统研究所,芜湖241000
出 处:《电子与信息学报》2023年第1期106-115,共10页Journal of Electronics & Information Technology
基 金:国家自然科学基金(62174001,61904001);安徽省重点研究与开发计划(202104b11020032);安徽工程大学中青年拔尖人才计划。
摘 要:图卷积神经网络(GCN)在社交网络、电子商务、分子结构推理等任务中的表现远超传统人工智能算法,在近年来获得广泛关注。与卷积神经网络(CNN)数据独立分布不同,图卷积神经网络更加关注数据之间特征关系的提取,通过邻接矩阵表示数据关系,因此其输入数据和操作数相比卷积神经网络而言都更加稀疏且存在大量数据传输,所以实现高效的GCN加速器是一个挑战。忆阻器(ReRAM)作为一种新兴的非易失性存储器,具有高密度、读取访问速度快、低功耗和存内计算等优点。利用忆阻器为CNN加速已经被广泛研究,但是图卷积神经网络极大的稀疏性会导致现有加速器效率低下,因此该文提出一种基于忆阻器交叉阵列的高效图卷积神经网络加速器,首先,该文分析GCN中不同操作数的计算和访存特征,提出权重和邻接矩阵到忆阻器阵列的映射方法,有效利用两种操作数的计算密集特征并避免访存密集的特征向量造成过高开销;进一步地,充分挖掘邻接矩阵的稀疏性,提出子矩阵划分算法及邻接矩阵的压缩映射方案,最大限度降低GCN的忆阻器资源需求;此外,加速器提供对稀疏计算支持,支持压缩格式为坐标表(COO)的特征向量输入,保证计算过程规则且高效地执行。实验结果显示,该文加速器相比CPU有483倍速度提升和1569倍能量节省;相比GPU也有28倍速度提升和168倍能耗节省。Graph Convolutional Networks(GCNs) have superior performance in tasks such as social networking,ecommerce, molecular structure reasoning relative to traditional artificial intelligence algorithms, and have gained intensive attention in recent years. Unlike the independent distribution of data in Convolutional Neural Networks(CNNs), GCNs pay more attention to extract feature relationships between data, which is represented by the adjacency matrix. Therefore, the input data and operands in GCNs are much sparse and there are a large amount of data transmission, which makes it a challenge to implement an efficient GCN accelerator. Resistive Random Access Memory(ReRAM) as a new type of non-volatile memory has the advantages of high density, fast read access, near-zero leakage power and processing in-memory. Using ReRAM to accelerate CNNs has been widely studied. However, the extreme sparsity of GCNs makes it inefficiency to deploy on existing accelerators. In this work, a GCN accelerator based on ReRAM is proposed. First, the calculation and memory access characteristics of different operands in the GCN are analyzed, and a novel weight and adjacency matrix mapping policy is proposed by exploiting the intensive computing characteristic of weight and adjacency matrix, so that avoiding the excessive overhead caused by massive memory accesses;As for the extremely sparse adjacency matrix, a sub-matrix partitioning algorithm and a compression mapping scheme are proposed to minimize the GCN’s ReRAM resource requirements;Moreover, efficient processing on the sparse input feature vector with COOrdinate list(COO) compression format is provided by the proposed accelerator and the regular and efficient execution with the input feature vector are ensured. Experimental results show that the proposed work achieves 483 times speedup and 1569 times energy saving compared to CPU, and achieves 28 times speedup and consumes 168 times less energy over the GPU.
关 键 词:存算一体 新型非易失性存储器 图卷积神经网络 加速器
分 类 号:TN929.5[电子电信—通信与信息系统] TN601[电子电信—信息与通信工程]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.229