Authors: Yuxi HAN, Dequan LI, Yang YANG
Source: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿(英文版)), 2025, No. 3, pp. 385-399 (15 pages)
Funding: Project supported by the Academic and Technical Leaders and Backup Candidates Program of Anhui Province, China (No. 2019h211) and the Natural Science Foundation of Anhui Province, China (No. 2208085ME128).
Abstract: Deep reinforcement learning has shown remarkable capabilities in visual tasks, but it generalizes poorly when the input images contain interference signals, so trained agents are hard to apply in a new environment. To enable agents to distinguish between noise signals and important pixels in images, data augmentation techniques and auxiliary networks have proven to be effective solutions. We introduce a novel algorithm, saliency-extracted Q-value by augmentation (SEQA), which encourages the agent to explore unknown states more comprehensively and focus its attention on important information. Specifically, SEQA masks out interfering features, extracts salient features, and then updates the mask decoder network with critic losses to encourage the agent to focus on important features and make correct decisions. We evaluate our algorithm on the DeepMind Control generalization benchmark (DMControl-GB), and the experimental results show that it greatly improves training efficiency and stability. Moreover, our algorithm is superior to state-of-the-art reinforcement learning methods in terms of sample efficiency and generalization in most DMControl-GB tasks.
Keywords: Deep reinforcement learning; Visual tasks; Generalization; Data augmentation; Significance; DeepMind Control generalization benchmark
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]