Automatic Scheduling Search Optimization Method Based on TVM


Authors: HAN Lin, WANG Yifan [2], LI Jianan, GAO Wei [1]

Affiliations: [1] National Supercomputing Center in Zhengzhou, Zhengzhou University, Zhengzhou 450001, China; [2] School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China

Source: Computer Science (《计算机科学》), 2025, No. 3, pp. 268-276, 9 pages

Funding: Major Science and Technology Special Project of Henan Province (221100210600).

Abstract: With the rapid development of artificial intelligence, new operators and hardware keep emerging, and developing and maintaining operator libraries has become a major challenge. Manual optimization alone can no longer meet the performance demands of AI models. Ansor is an automatic operator scheduling technique based on TVM. It searches for the best schedule for a deep learning model or operator on different backends and generates high-performance code without requiring users to define templates by hand. Its huge search space, however, makes the search inefficient. Two optimization schemes are therefore proposed: 1) a reinforcement learning based algorithm that selects the best-performing sketch; 2) mutation rule prediction based on a machine learning model. Both schemes aim to shorten the search for the best schedule and to generate high-performance operators quickly. To evaluate their effectiveness, three models, including Resnet-50, and three operators, including conv2d, are tested. The results show that the optimized Ansor needs only 70% to 75% of the search time to generate target programs whose performance is the same as or better than before, and that at the optimal number of iterations the inference speed of the target program improves by up to 5%.
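The abstract does not say which reinforcement learning algorithm is used for sketch selection. As an illustration only, the idea of learning which sketch tends to yield the fastest programs can be sketched as an epsilon-greedy multi-armed bandit. Everything below is an assumption rather than the paper's method or TVM's API: the function names, the `epsilon` value, and `toy_measure`, which stands in for a real hardware measurement of a program derived from a sketch.

```python
import random


def select_sketch(values, counts, epsilon, rng):
    """Pick a sketch index: try each sketch once, then explore or exploit."""
    untried = [i for i, c in enumerate(counts) if c == 0]
    if untried:
        return untried[0]
    if rng.random() < epsilon:
        return rng.randrange(len(values))  # explore a random sketch
    return max(range(len(values)), key=lambda i: values[i])  # exploit the best


def update(values, counts, idx, reward):
    """Update the running mean reward of the chosen sketch."""
    counts[idx] += 1
    values[idx] += (reward - values[idx]) / counts[idx]


def run_bandit(measure, num_sketches, rounds, epsilon=0.1, seed=0):
    """Spend `rounds` measurements, steering them toward promising sketches."""
    rng = random.Random(seed)
    values = [0.0] * num_sketches
    counts = [0] * num_sketches
    for _ in range(rounds):
        idx = select_sketch(values, counts, epsilon, rng)
        update(values, counts, idx, measure(idx, rng))
    return values, counts


def toy_measure(idx, rng):
    # Hypothetical reward (higher is better, e.g. throughput) with noise.
    # A real tuner would compile and time a program sampled from the sketch.
    base = [1.0, 2.0, 3.5][idx]
    return base + rng.uniform(-0.2, 0.2)


values, counts = run_bandit(toy_measure, num_sketches=3, rounds=200)
best = max(range(3), key=lambda i: values[i])
print("estimated rewards:", values)
print("measurements per sketch:", counts)
print("preferred sketch:", best)
```

In this toy setting most of the measurement budget ends up on the sketch with the highest estimated reward, which is the kind of saving in search time the abstract describes. A real integration would replace `toy_measure` with measured program performance.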

Keywords: automatic scheduling; TVM compiler; search speed optimization; machine learning; reinforcement learning; deep learning models

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
