基于GPU的自适应邻域压缩禁忌搜索的软硬件划分算法  被引量:2

GPU-based adaptive compacting neighborhood tabu search for hardware/software partitioning

在线阅读下载全文

作  者:侯能 何发智[1,2] 周毅 陈壹林 Neng HOU;Fazhi HE;Yi ZHOU;Yilin CHEN(State Key Laboratory of Software Engineering,Wuhan University,Wuhan 430072,China;School of Computer Science,Wuhan University,Wuhan 430072,China;School of Information Science and Engineering,Wuhan University of Science and Technology,Wuhan 430081 China)

机构地区:[1]武汉大学软件工程国家重点实验室,武汉430072 [2]武汉大学计算机学院,武汉430072 [3]武汉科技大学信息科学与工程学院,武汉430081

出  处:《中国科学:信息科学》2018年第8期978-999,共22页Scientia Sinica(Informationis)

基  金:国家自然科学基金(批准号:61472289);湖北省自然科学基金(批准号:2015CBF254)资助项目

摘  要:软硬件划分是软硬件协同设计中的关键步骤,决定了哪些功能由硬件执行,哪些功能由软件执行.软硬件划分属于NP难问题.现代嵌入式系统的复杂性提高,造成软硬件划分问题规模变大,需要采用启发式方法求解.禁忌搜索是求解软硬件划分的有效方法.然而,算法的求解过程非常耗时.已有的禁忌搜索求解软硬件划分是串行实现,要折中考虑解的质量和算法的运行时间.这种考虑牺牲了解的质量.本文提出基于GPU的自适应邻域压缩(compacting neighborhood)禁忌搜索的软硬件划分算法.首先,提出自适应策略.自适应策略能够增强算法的搜索集中性,提高解的质量.GPU的大规模并行特性可以降低算法的运行时间.其次,为了使算法在GPU上高效地执行,提出基于GPU的任务图表达、线程–候选解映射、数据布局和访存等一系列优化策略.最后,实验采用统一设备架构(CUDA)编程,并根据相关基准任务图,通过不同的计算–通信比和实时约束条件,对提出的方法进行验证.结果表明,本文方法的解质量要优于已有的方法.对比将自适应邻域压缩禁忌搜索自然移植到GPU后的运行时间,提出的GPU上的执行优化策略明显地降低了求解时间.另外,在更大规模的软硬件划分上验证了基于GPU的方法在时间上的优势.Hardware/software(HW/SW) partitioning is an essential step in hardware/software co-design as it determines the functions to be implemented in the hardware and software.HW/SW partitioning problems are NP hard.The increasing complexities of modern embedded systems worsen the problem,thus motivating the search for solutions by heuristics.The tabu search algorithm is an effective method for HW/SW partitioning.However,the process is time consuming.The existing tabu search methods for HW/SW partitioning focus on sequential implementations that involve a trade-off between solution time and solution quality.This trade-off sacrifices the solution quality.This paper presents a GPU-based adaptive compacting neighborhood tabu search for HW/SW partitioning.First,we present an adaptive strategy that enhances the search intensification to improve the solution quality.The massive parallelism of GPU architecture can reduce the solution time of the proposed strategy.Next,to ensure that the algorithm execute efficiently on the GPU,we further propose several implementation strategies such as the representation of task graph on the GPU,the mapping between the GPU thread and candidate,and data layout and memory access optimization.Finally,we realize the proposed method in a computing unified device architecture,namely CUDA,and verify the effectiveness according to the related benchmark with different communication-to-computation ratios and real-time constraint requirements.The results show that the proposed method can yield a better solution quality compared to the existing methods.By comparing with the naive GPU implementation of adaptive compacting neighborhood tabu search,the proposed implementation strategies on the GPU significantly reduced the solution time.In addition,we verified the advantage of the GPU-based method for very large HW/SW partitioning.

关 键 词:软硬件协同设计 启发式方法 图形处理单元 禁忌搜索 自适应算法 

分 类 号:TP368.1[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象