A random forest long-term precipitation prediction method combined with multiple hypothesis testing and its application

Authors: LI Mengjie (李梦杰); LIU Kun (刘琨); MOU Hailei (牟海磊); YIN Zhaokai (殷兆凯); LIU Zhiwu (刘志武); WU Di (吴迪); LIANG Lili (梁犁丽) (Institute of Science and Technology, China Three Gorges Corporation, Beijing 101199, China; China Three Gorges International Corporation, Beijing 101199, China)

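Affiliations: [1] Institute of Science and Technology, China Three Gorges Corporation, Beijing 101199 [2] China Three Gorges International Corporation, Beijing 101199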

Source: South-to-North Water Transfers and Water Science & Technology (《南水北调与水利科技(中英文)》), 2024, No. 5, pp. 920-926 (7 pages)

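Funding: Independent Research Project of China Three Gorges Corporation (NBZZ20210055); National Science and Technology Basic Resources Survey Program (2021xjkk0405).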

Abstract: To address the false discovery rate problem that arises when the random forest method selects predictors empirically, a false discovery rate control method from the field of multiple hypothesis testing is introduced for quality control of predictor screening. This turns predictor screening from experience-dependent into data-dependent and yields a random forest long-term precipitation prediction method based on multiple hypothesis testing. Taking the upper Paraná River basin in Brazil as the study area and using monthly climate system indices, the proposed method is applied to simulate, predict, test, and cross-validate monthly precipitation at 54 rain gauge stations in the study area for 2018-2020. The results show that, compared with the traditional random forest method, the proposed method has higher forecast accuracy: the average qualified rate of predictions across stations for January to December reaches 64%, and the qualified rate for June reaches 84%, indicating that the method can serve as one of the effective tools for long-term precipitation prediction in a basin.

Long-term precipitation prediction refers to forecasting precipitation over a period of more than one month and is a crucial aspect of integrated water resources management. Its accuracy is low because of various uncertainties. Traditional long-term precipitation prediction methods fall mainly into dynamical numerical methods and mathematical-statistical methods. Dynamical numerical methods predict precipitation by simulating future weather conditions with sea-land thermodynamic models; this approach has a clear physical mechanism, but the model calculations are complex. Data-driven mathematical-statistical methods model the correlation between precipitation and predictors from a statistical perspective to establish a long-term prediction model. However, research on precipitation prediction based on mathematical-statistical methods mainly focuses on improving the model, with relatively little emphasis on how to select the predictors, even though the predictors affect the accuracy of model predictions. The focus and challenge of precipitation prediction therefore lie in selecting the predictors needed for modeling from the relevant factors. Random forest, a flexible, efficient, and easy-to-use machine learning algorithm, has been widely used in hydrological prediction. The random forest method calculates importance scores for the related factors and then selects predictors for the model based on experience, so some of the selected predictors may be false discoveries. To address the false discovery rate issue in the random forest algorithm when selecting key predictors, this study employs the false discovery rate control method in multiple hypothesis testing to ensure quality control in predictor selection. This shifts variable selection from being experience-dependent to being data-dependent. Finally, the random forest algorithm is used to construct a long-term precipitation prediction model by integrating the selected precipit
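
The abstract does not say which false discovery rate control procedure is used. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, one widely used FDR-control method, to screen candidate predictors. It assumes that each candidate climate index already has a p-value for the null hypothesis "this index is unrelated to precipitation" (for example from a permutation test on random forest importance scores, which is also an assumption). The function name, index names, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (illustrative, not the paper's code).

    Returns a list of booleans, True where the null hypothesis is rejected
    (the predictor is kept) while controlling the FDR at level q.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose sorted p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p-values, even if some earlier rank failed its own test.
    selected = [False] * m
    for idx in order[:k_max]:
        selected[idx] = True
    return selected


# Hypothetical p-values for five candidate climate indices.
candidate_p = {
    "index_A": 0.001,
    "index_B": 0.008,
    "index_C": 0.039,
    "index_D": 0.041,
    "index_E": 0.600,
}
names = list(candidate_p)
mask = benjamini_hochberg([candidate_p[n] for n in names], q=0.05)
kept = [name for name, keep in zip(names, mask) if keep]
print(kept)  # ['index_A', 'index_B']
```

With a plain cutoff of p < 0.05, four of the five indices would pass; the step-up thresholds (0.01, 0.02, 0.03, 0.04, 0.05 for ranks 1 to 5) keep only the two with the strongest evidence. In a data-dependent screening of this kind, the cutoff follows from the chosen FDR level rather than from a hand-picked number of top-ranked factors.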

Keywords: random forest method; long-term precipitation prediction; predictor screening; quality control; multiple hypothesis testing

Classification: TV111 [Hydraulic Engineering: Hydrology and Water Resources]

 
