SAC Model Based Improved Genetic Algorithm for Solving TSP

Cited by: 15

Authors: CHEN Bin [1]; LIU Weiguo [2] (School of Automation, Central South University, Changsha 410083, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China)

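Affiliations: [1] School of Automation, Central South University, Changsha 410083; [2] School of Computer Science and Engineering, Central South University, Changsha 410083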

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2021, No. 9, pp. 1680-1693 (14 pages)

Funding: National Natural Science Foundation of China (61073187).

Abstract: The genetic algorithm (GA) has strong global search ability and is easy to implement, but slow convergence, instability, and a tendency to become trapped in local optima restrict its application. To overcome these disadvantages, this paper proposes an improved genetic algorithm based on the deep reinforcement learning model SAC (soft actor-critic) and applies it to the traveling salesman problem (TSP). The improved algorithm treats the population as the environment with which the agent interacts, and uses a greedy algorithm to initialize this environment so that the initial population is of higher quality. Improved crossover and mutation operations serve as the agent's action space. Treating the evolution of the population as a whole and aiming to maximize the cumulative reward of the evolutionary process, the algorithm uses an SAC-based policy gradient method, combined with the current fitness of the individuals in the population, to generate the action policy that controls evolution. Through the agent's actions, this policy makes balanced use of the global and local search abilities of the genetic algorithm, optimizing the evolutionary process while balancing the population's convergence speed against the number of genetic operations. Experimental results on TSPLIB instances indicate that the improved genetic algorithm effectively avoids falling into local optima and reduces the number of iterations in the optimization process while improving the convergence speed of the population.

Keywords: reinforcement learning; genetic algorithm (GA); traveling salesman problem (TSP); deep policy gradient; soft actor-critic (SAC) model

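Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP301.6 [Automation and Computer Technology: Control Science and Engineering]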
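
The abstract frames the population as an environment, initialized greedily, on which an agent acts by choosing crossover or mutation. The Python sketch below illustrates only that framing and is not the authors' implementation. Several details are assumptions made for illustration: the names `PopulationEnv`, `greedy_tour`, `segment_crossover` and `swap_mutation`, the nearest-neighbor reading of "greedy algorithm", the plain (not "improved") operators, the replace-worst rule, and the reward defined as the reduction in best tour length. The SAC policy itself is omitted; a random action choice stands in for it.

```python
import math
import random


def tour_length(tour, coords):
    """Length of the closed tour visiting coords in the given order."""
    n = len(tour)
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
               for k in range(n))


def greedy_tour(coords, start):
    """Nearest-neighbor tour from `start` (assumed form of the greedy init)."""
    unvisited = set(range(len(coords))) - {start}
    tour = [start]
    while unvisited:
        last = coords[tour[-1]]
        nxt = min(unvisited, key=lambda c: math.dist(last, coords[c]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def segment_crossover(p1, p2, rng):
    """Keep a random segment of p1 in place; fill the rest in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    segment = p1[i:j + 1]
    kept = set(segment)
    rest = [c for c in p2 if c not in kept]
    return rest[:i] + segment + rest[i:]


def swap_mutation(tour, rng):
    """Return a copy of the tour with two random cities exchanged."""
    child = tour[:]
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


class PopulationEnv:
    """Hypothetical environment: state is the population, actions are operators."""

    def __init__(self, coords, pop_size, seed=0):
        self.coords = coords
        self.rng = random.Random(seed)
        starts = [self.rng.randrange(len(coords)) for _ in range(pop_size)]
        self.population = [greedy_tour(coords, s) for s in starts]

    def best_length(self):
        return min(tour_length(t, self.coords) for t in self.population)

    def step(self, action):
        """Apply action 0 (crossover) or 1 (mutation); return the reward."""
        before = self.best_length()
        if action == 0:
            p1, p2 = self.rng.sample(self.population, 2)
            child = segment_crossover(p1, p2, self.rng)
        else:
            child = swap_mutation(self.rng.choice(self.population), self.rng)
        # Assumed replacement rule: the child replaces the worst individual
        # only if it is shorter, so the best tour never gets worse.
        worst = max(range(len(self.population)),
                    key=lambda k: tour_length(self.population[k], self.coords))
        if tour_length(child, self.coords) < tour_length(
                self.population[worst], self.coords):
            self.population[worst] = child
        # Assumed reward: how much the best tour length decreased.
        return before - self.best_length()


if __name__ == "__main__":
    rng = random.Random(1)
    cities = [(rng.random(), rng.random()) for _ in range(20)]
    env = PopulationEnv(cities, pop_size=10, seed=1)
    total = 0.0
    for _ in range(200):
        total += env.step(rng.randrange(2))  # random stand-in for the SAC policy
    print("best length:", env.best_length(), "cumulative reward:", total)
```

In a setup closer to the one the abstract describes, the random choice in the loop would be replaced by an action sampled from a trained SAC policy that observes the current fitness of the population.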
