Reinforcement Learning Algorithm for Visual Auto-driving Based Space-time Features  Cited by: 1

Authors: YANG Lei; LEI Wei-min; ZHANG Wei (School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China; DAMO Academy, Alibaba Group, Hangzhou 310000, China)

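Affiliations: [1] School of Computer Science and Engineering, Northeastern University, Shenyang 110819 [2] Autonomous Driving Laboratory, DAMO Academy, Alibaba Group, Hangzhou 310000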

Source: Journal of Chinese Computer Systems, 2023, No. 2, pp. 356-362 (7 pages)

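Funding: Supported by the Fundamental Research Funds for the Central Universities (N2216010) and the National Key Research and Development Program of China (2018YFB1702000).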

Abstract: The challenges of vision-based autonomous driving come mainly from two sources: the high dimensionality of environmental information and the large bias in the training data distribution. To address the high dimensionality, the Space-Time Reinforce Learning Auto Driving (STRLAD) algorithm extracts features with a dual-stream network comprising (ⅰ) a perception network, which takes RGB images sampled from the camera at a low rate as input and extracts the overall semantic features of each image; (ⅱ) a motion network, which takes grayscale images sampled from the video at a high rate as input and extracts object motion features; and (ⅲ) attention-based fusion of the perception and motion networks at every feature layer, which yields the feature representation of the vehicle's environment. To address the biased training data distribution, STRLAD feeds the features extracted by the dual-stream network into the Soft Actor-Critic (SAC) reinforcement learning algorithm, which learns a stochastic driving policy and alleviates the data bias and generalization problems. STRLAD is trained and validated in the CARLA simulator. The experimental results show that it can drive autonomously in complex urban environments, especially those with many dynamic objects, with a completion rate of 89%.
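The dual-stream idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the `slow_stride` parameter, the BT.601 grayscale weights, and the per-channel softmax gate used as the "attention" step are all assumptions chosen for illustration, and the real STRLAD networks are deep feature extractors rather than the toy vectors used here.

```python
from math import exp


def to_gray(rgb_frame):
    """Convert an H x W frame of (r, g, b) tuples to grayscale.

    Uses the ITU-R BT.601 luma weights (an assumption; the paper does not
    state which conversion it uses).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_frame]


def sample_streams(frames, slow_stride=4):
    """Split one camera stream into the two network inputs.

    Perception input: every `slow_stride`-th frame, kept in RGB (low rate).
    Motion input: every frame, converted to grayscale (high rate).
    `slow_stride` is a hypothetical parameter, not a value from the paper.
    """
    perception_input = frames[::slow_stride]
    motion_input = [to_gray(frame) for frame in frames]
    return perception_input, motion_input


def attention_fuse(perception_feat, motion_feat):
    """Fuse two equal-length feature vectors channel by channel.

    Each channel gets softmax weights over the two streams, so the fused
    value is a convex combination of the two inputs. This is a stand-in for
    the paper's attention mechanism, whose exact form is not given here.
    """
    fused = []
    for p, m in zip(perception_feat, motion_feat):
        top = max(p, m)                      # subtract max for stability
        w_p, w_m = exp(p - top), exp(m - top)
        fused.append((w_p * p + w_m * m) / (w_p + w_m))
    return fused
```

With eight frames and `slow_stride=4`, `sample_streams` returns two RGB frames for the perception stream and eight grayscale frames for the motion stream. In a full pipeline the fused features would then serve as the state input to an SAC agent.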

Keywords: deep reinforcement learning; computer vision; autonomous driving; deep neural network; artificial intelligence

Classification number: TP389 [Automation and Computer Technology: Computer System Architecture]

 
