Source: Computer Science (《计算机科学》), 2025, Issue 5, pp. 67-75 (9 pages)
Funding: Foundation of the National Key Laboratory of Parallel and Distributed Computing (2023-KJWPDL-01).
Abstract: The thread-count setting is one of the key factors affecting the performance and power consumption of multi-threaded programs. However, current algorithms for finding the optimal number of threads usually start the search from a single fixed point, which often leads to low search precision and high search overhead. The distribution and location of the optimal number of threads depend on several factors, including the type of program, the optimization objective (performance, power consumption, or energy-delay product (EDP)), the parallel multi-threaded regions, and the software and hardware configuration parameters. Focusing on energy-efficiency-first optimal thread number search, this paper proposes an Energy-Efficiency-First Optimal Thread Number Search Algorithm based on Specific Starting Point Classification (abbreviated as the TS^(3) method). The method determines the search starting point with a starting-point setting scheme based on program classification, and then searches for the optimal number of threads with a heuristic algorithm combined with binary search. This improves the accuracy of the optimal thread number search under the energy-efficiency-first objectives (optimal performance, optimal power consumption, optimal EDP) and reduces the search overhead. The effectiveness of the algorithm is validated experimentally using eight benchmarks on two x86 platforms and one ARM platform. Compared with the baseline, the TS^(3) method achieves an average performance improvement of 0.29% (Platform A), 0.17% (Platform B), and 10.77% (Platform C); an average power consumption reduction of 2.35% (Platform A), 1.87% (Platform B), and 15.97% (Platform C); and an average EDP reduction of 6.36% (Platform A), 5.07% (Platform B), and 46.94% (Platform C). Across the three platforms, compared with current classical search methods, the TS^(3) method improves performance by 10.16% on average, reduces power consumption by 13.45% on average, and reduces EDP by 23.77% on average, while reducing the search overhead by 86.8% on average.
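The abstract gives only a high-level description of the search, so the following is a minimal sketch of the general idea rather than the TS^(3) algorithm itself. It assumes that EDP is computed as energy times delay (power × time²), that the cost curve over thread counts is unimodal, and that a starting point has already been produced by some program classifier (not shown). The names `edp`, `search_threads`, and `synthetic_cost`, as well as the synthetic timing and power model, are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, Tuple


def edp(time_s: float, power_w: float) -> float:
    """Energy-delay product: energy (power * time) multiplied by time."""
    return power_w * time_s * time_s


def search_threads(
    cost: Callable[[int], float], start: int, lo: int, hi: int
) -> Tuple[int, int]:
    """Step-halving local search from a given starting point.

    Assumes `cost` is unimodal on [lo, hi]. Returns the best thread
    count found and the number of distinct configurations measured.
    """
    cache: Dict[int, float] = {}

    def measured(n: int) -> float:
        # Each thread count is measured at most once.
        if n not in cache:
            cache[n] = cost(n)
        return cache[n]

    best = min(max(start, lo), hi)
    step = max(1, (hi - lo) // 2)
    while step >= 1:
        moved = False
        for cand in (best - step, best + step):
            if lo <= cand <= hi and measured(cand) < measured(best):
                best, moved = cand, True
                break
        if not moved:
            step //= 2  # no improvement at this distance: halve the step
    return best, len(cache)


def synthetic_cost(n: int) -> float:
    """Hypothetical model: runtime falls with n but pays a per-thread
    overhead, while power grows linearly with n."""
    time_s = 100.0 / n + 0.5 * n
    power_w = 20.0 + 5.0 * n
    return edp(time_s, power_w)


if __name__ == "__main__":
    for start in (1, 12, 64):
        best, evals = search_threads(synthetic_cost, start, 1, 64)
        print(f"start={start}: best={best}, measurements={evals}")
```

In this sketch the starting point only changes how many configurations are measured before the search settles, which is the kind of overhead a classification-based starting point is meant to reduce.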
Keywords: multi-threaded programs; energy-efficiency optimization; optimal thread number; random forest algorithm; heuristic algorithm
Classification code: TP302 [Automation and Computer Technology: Computer System Architecture]