Yunnan University Library Alliance Document Sharing Service Platform - REINFORCEMENT

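REINFORCEMENT_LEARNING: Works: 963; Citations: 1625; H-index: 17; Related authors: 孟巧荣, 廉自生, 辛英, 尹晓虎, 王长缨; Related institutions: Beijing University of Technology, University of Electronic Science and Technology of China, Shenyang Ligong University, Taiyuan University of Technology; Related funds: National Natural Science Foundation of China, Beijing Natural Science Foundation, China Postdoctoral Science Foundation, Natural Science Foundation of Guangdong Province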

Clustered Reinforcement Learning: Frontiers of Computer Science, 2025, No. 4, pp. 43-57 (15 pages); Xiao MA, Shen-Yi ZHAO, Zhao-Heng YIN, Wu-Jun LI; Supported by the National Natural Science Foundation of China (Grant No. 62192783) and the Fundamental Research Funds for the Central Universities (No. 020214380108).; Exploration strategy design is a challenging problem in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards. During exploration, the agent tries to discover unexplor...; Keywords: deep reinforcement learning EXPLORATION count-based method CLUSTERING K-MEANS

Offline model-based reinforcement learning with causal structured world models: Frontiers of Computer Science, 2025, No. 4, pp. 77-90 (14 pages); Zhengmao ZHU, Honglong TIAN, Xionghui CHEN, Kun ZHANG, Yang YU; Model-based methods have recently been shown to be promising for offline reinforcement learning (RL), which aims at learning good policies from historical data without interacting with the environment. Previous model-based off...; Keywords: reinforcement learning offline reinforcement learning model-based reinforcement learning causal discovery

Behaviour-diverse automatic penetration testing: a coverage-based deep reinforcement learning approach: Frontiers of Computer Science, 2025, No. 3, pp. 15-24 (10 pages); Yizhou YANG, Longde CHEN, Sha LIU, Lanning WANG, Haohuan FU, Xin LIU, Zuoning CHEN; Supported by the Key Research Project of Zhejiang Lab (No. 2021PB0AV02).; Reinforcement Learning (RL) is gaining importance in automating penetration testing as it reduces human effort and increases reliability. Nonetheless, given the rapidly expanding scale of modern network infrastructure, the...; Keywords: network security penetration testing reinforcement learning artificial intelligence

Priority-Aware Resource Allocation for VNF Deployment in Service Function Chains Based on Graph Reinforcement Learning: Computers, Materials & Continua, 2025, No. 5, pp. 1649-1665 (17 pages); Seyha Ros, Seungwoo Kang, Taikuong Iv, Inseok Song, Prohim Tam, Seokhoon Kim; Supported by the Institute of Information & Communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (MSIT) (No. RS-2022-00167197, Development of Intelligent 5G/6G Infrastructure Technology for the Smart City); in part by the National Research Foundation of Korea (NRF), Ministry of Education, through the Basic Science Research Program under Grant NRF-2020R1I1A3066543; in part by BK21 FOUR (Fostering Outstanding Universities for Research) under Grant 5199990914048; in part by the Soonchunhyang University Research Fund.; Recently, Network Functions Virtualization (NFV) has become a critical resource for optimizing capability utilization in the 5G/B5G era. NFV decomposes the network resource paradigm, demonstrating the efficient utilization...; Keywords: Deep reinforcement learning graph neural network multi-access edge computing network functions virtualization software-defined networking

Cyclical Training Framework with Graph Feature Optimization for Knowledge Graph Reasoning: Computers, Materials & Continua, 2025, No. 5, pp. 1951-1971 (21 pages); Xiaotong Han, Yunqi Jiang, Haitao Wang, Yuan Tian; Supported by the National Key Research and Development Program of China (No. 2023YFF0905400) and the National Natural Science Foundation of China (No. U2341229).; Knowledge graphs (KGs), which organize real-world knowledge in triples, often suffer from issues of incompleteness. To address this, multi-hop knowledge graph reasoning (KGR) methods have been proposed for interpretable know...; Keywords: Knowledge graph reinforcement learning TRANSFORMER

Obstacle Avoidance Path Planning for Delta Robots Based on Digital Twin and Deep Reinforcement Learning: Computers, Materials & Continua, 2025, No. 5, pp. 1987-2001 (15 pages); Hongxiao Wang, Hongshen Liu, Dingsen Zhang, Ziye Zhang, Yonghui Yue, Jie Chen; Supported in part by the National Natural Science Foundation of China under Grants 62303098 and 62173073; in part by the China Postdoctoral Science Foundation under Grant 2022M720679; in part by the Central University Basic Research Fund of China under Grant N2304021; in part by the Liaoning Provincial Science and Technology Plan Project-Technology Innovation Guidance of the Science and Technology Department under Grant 2023JH1/10400011.; Despite its immense potential, the application of digital twin technology in real industrial scenarios still faces numerous challenges. This study focuses on industrial assembly lines in sectors such as microelectronics...; Keywords: Digital twin deep reinforcement learning delta robot obstacle path planning

Active Object Detection Based on PPO Learning Algorithm with Decision Knowledge Guidance: Machine Intelligence Research, 2025, No. 2, pp. 386-396 (11 pages); Fujing Yao, Guohui Tian, Yuhao Wang, Ning Yang; Supported in part by the National Natural Science Foundation of China (Nos. 62273203 and U1813215); in part by the Special Fund for the Taishan Scholars Program of Shandong Province, China (No. ts2015110005).; After detecting a target object, a service robot must approach the target object to perform the associated service task. In active object detection (AOD) tasks, effective feature information representation and comprehensiv...; Keywords: Service robot active object detection reinforcement learning path experience comprehensive decision model

A novel trajectories optimizing method for dynamic soaring based on deep reinforcement learning: Defence Technology (防务技术), 2025, No. 4, pp. 99-108 (10 pages); Wanyong Zou, Ni Li, Fengcheng An, Kaibo Wang, Changyin Dong; Supported by the National Natural Science Foundation of China (Grant Nos. 52372398 & 62003272).; Dynamic soaring, inspired by the wind-riding flight of birds such as albatrosses, is a biomimetic technique which leverages wind fields to enhance the endurance of unmanned aerial vehicles (UAVs). Achieving a precise soar...; Keywords: Dynamic soaring Differential flatness Trajectory optimization Proximal policy optimization

Key Mechanisms on Resource Optimization Allocation in Minority Game Based on Reinforcement Learning: Tsinghua Science and Technology, 2025, No. 2, pp. 721-731 (11 pages); Changyan Di, Tianyi Wang, Qingguo Zhou, Jinqiang Wang; Partially supported by the National Natural Science Foundation of China (Nos. U22A20261 and 61402210); the National Key R&D Program of China (No. 2020YFC0832500); the Gansu Province Science and Technology Major Project-Industrial Project (No. 22ZD6GA048); the Gansu Province Key Research and Development Plan-Industrial Project (No. 22YF7GA004); the Fundamental Research Funds for the Central Universities (Nos. lzujbky-2022-kb12, lzujbky-2021-sp43, lzujbky-2020-sp02, lzujbky-2019-kb51, and lzujbky-2018-k12); the Science and Technology Plan of Qinghai Province (No. 2020-GX-164); the Supercomputing Center of Lanzhou University; and the Gansu Provincial Science and Technology Major Special Innovation Consortium Project (No. 21ZD3GA002).; The emergence of coordinated and consistent macro behavior among self-interested individuals competing for limited resources represents a central inquiry in comprehending market mechanisms and collective behavior. Trad...; Keywords: minority game optimization of resource allocation multi-agent system reinforcement learning

A Low-Collision and Efficient Grasping Method for Manipulator Based on Safe Reinforcement Learning: Computers, Materials & Continua, 2025, No. 4, pp. 1257-1273 (17 pages); Qinglei Zhang, Bai Hu, Jiyun Qin, Jianguo Duan, Ying Zhou; Grasping is one of the most fundamental operations in modern robotics applications. While deep reinforcement learning (DRL) has demonstrated strong potential in robotics, there is too much emphasis on maximizing the cumu...; Keywords: Safe reinforcement learning(Safe RL) manipulator grasping obstacle avoidance constraints lagrange multiplier dynamic weighted
