Model-based methods have recently shown promise for offline reinforcement learning (RL), which aims to learn good policies from historical data without interacting with the environment. Previous model-based off...
Supported by the National Natural Science Foundation of China (Nos. 12175321, 11975021, and 11675275); the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA10010900).
The Taishan Antineutrino Observatory (TAO) is a satellite experiment of the Jiangmen Underground Neutrino Observatory, located near the Taishan nuclear power plant (NPP). TAO aims to measure the energy spectrum of reac...
Supported by the Social Science Research Project of Yichang (ysk24ybkt011).
Background: Surgical Nursing is a core course in the nursing specialty and a large one, lasting 96 credit hours. In response to teaching pain points such as the complicated and boring content of the surgical nursing c...
Supported in part by the National Natural Science Foundation of China (62176259, 62373364); the Key Research and Development Program of Jiangsu Province (BE2022095).
To alleviate the extrapolation error and instability inherent in a Q-function learned directly by off-policy Q-learning (QL-style) on static datasets, this article utilizes the on-policy state-action-reward-state-action (SA...
Supported by the National Natural Science Foundation of China (Grant Nos. 62341201, 62122095, 62072472, 62172445, 62302260, and 62202256); the National Key R&D Program of China (Grant No. 2022YFF0604502); the China Postdoctoral Science Foundation (Grant No. 2023M731956); a grant from the Guoqiang Institute, Tsinghua University.
Offline reinforcement learning (RL), which seeks to learn policies from static datasets without active online exploration, has attracted increasing attention in recent years. However, the existing offline RL approaches often...
National Natural Science Foundation of China, Grant/Award Number: U19A2083.
Offline reinforcement learning (RL) aims to learn policies entirely from passively collected datasets, making it a data-driven decision method. One of the main challenges in offline RL is the distribution shift problem, w...
Research on the automation and intelligent operation of tunnel boring machines (TBMs) is receiving increasing attention, benefiting from the increasing availability of construction data. However, most studies on TBM operations optimization w...
Guangdong Ocean University Undergraduate Teaching Quality and Teaching Reform Project “Integrated English 3 Blended Online and Offline Course” (PX-112024042); Guangdong Ocean University Research Initiation Project (060302162402).
This paper explores the design, implementation, and evaluation of the Integrated English 3 blended course, which integrates online learning through massive open online courses (MOOCs) and face-to-face classroom instruction...
Supported by the National Natural Science Foundation of China (No. 52272382); the Aeronautical Science Foundation of China (No. 20200017051001); the Fundamental Research Funds for the Central Universities, China.
Non-learning-based motion and path planning for an unmanned aerial vehicle (UAV) suffers from low computational efficiency, mapping memory occupation, and local optimization problems. This article investigates the challenge ...
National Natural Science Foundation of China (62302014); Key Project of Science Research in Universities of Anhui Province of China (2023AH050492, 2023AH050497); Anhui Province Graduate Education Teaching Key Project (2023jyjxggyjY193); Anqing Normal University Undergraduate Education Teaching Key Project (2023aqnujyxm15, 2023aqnujyxm12); Anqing Normal University Undergraduate Education Teaching General Project (2023aqnujyxm34).
The teaching mode of the Computer Composition Principles course includes theoretical and practical teaching. At present, the teaching content of the two components is inconsistent at our school. To this end,...