Exploration of Case-Based Teaching Reform in a Reinforcement Learning Course

Authors: Yan Ruidong; Gao Hongbo (School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100091, China; Department of Automation, University of Science and Technology of China, Hefei 230026, China)

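Affiliations: [1] School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100091, China; [2] Department of Automation, University of Science and Technology of China, Hefei, Anhui 230026, China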

Source: China Modern Educational Equipment, 2023, No. 11, pp. 126-128 (3 pages)

Abstract: This paper analyzes problems in the teaching of reinforcement learning courses and introduces a "path optimization" case into classroom teaching, exploring a teaching method that uses a single case to link the core algorithms of reinforcement learning. First, a Markov decision process model is built from the "path optimization" case. Second, the principles of the dynamic programming, Monte Carlo, and temporal difference methods, and the differences among them, are explained. Finally, using the "path optimization" case together with accompanying programming, the differences among the core reinforcement learning algorithms are explained more intuitively, with the aim of improving the teaching quality of the reinforcement learning course.

Keywords: reinforcement learning; path optimization; practical teaching

Classification Code: G642.0 [Cultural Science - Higher Education]

 
