Importance Analysis of Warm-Season Rainstorm Forecast Factors in Jiangxi Province Based on a Machine Learning Model with SHAP (Shapley) Values

Authors: Xia Houjie; Xiao An (Jiangxi Meteorological Observatory, Nanchang 330096, China)

Affiliation: [1] Jiangxi Meteorological Observatory, Nanchang, Jiangxi 330096, China

Source: Meteorology and Disaster Reduction Research, 2024, No. 1, pp. 12-23 (12 pages)

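Funding: Key Research Project of the Jiangxi Meteorological Bureau (No. JX2020Z04); Key R&D Project of the Jiangxi Provincial Department of Science and Technology (No. 20203BBGL73223); Innovation and Development Special Project of the China Meteorological Administration (No. CXFZ2021Z012).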

Abstract: The lack of interpretability of machine learning (ML) models poses a challenge to their use in operational meteorology. Model interpretation and visualization offer an effective way to address this problem. In this paper, SHAP (Shapley) values are applied to the interpretation of an ML weather-forecasting model, and the influence of the forecast factors of a warm-season rainstorm model for Jiangxi Province on its forecasts is examined. Physical quantities from the ECMWF (European Centre for Medium-Range Weather Forecasts) high-resolution numerical model and precipitation observations from national weather stations for April-September of 2016-2020 and of 2021-2022 were used for XGBoost modeling and for model interpretation, respectively. The results show that the four factors with the highest global importance are, in order, total precipitation (42.70%), 850 hPa specific humidity (11.17%), 925 hPa relative humidity (10.44%), and 500 hPa relative humidity (9.16%). Case analyses show that in hit cases the SHAP values of the high-importance physical factors are large in the rainstorm area, whereas in miss (false alarm) cases the SHAP values of these factors are too small (too large) in the missed (false alarm) area. SHAP values thus give quantitative, physically meaningful explanations of the ML model at both the global and local levels. These explanations are largely consistent with synoptic principles and operational forecasting experience, which supports the wider use of ML in operational meteorology.
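The idea of a global importance percentage derived from Shapley values can be sketched on a toy example. The code below is only an illustration under stated assumptions, not the authors' pipeline: it uses a hand-written toy function instead of the XGBoost model, computes exact Shapley values against a single baseline point, and assumes that global importance is the mean absolute Shapley value of each factor normalized to 100%. The abstract does not state how the percentages were computed. The factor names, coefficients, and sample values are hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical factor names, loosely mirroring the abstract's top four factors.
FACTORS = ["total_precip", "q850", "rh925", "rh500"]

def toy_model(v):
    # Toy stand-in for a trained model (assumption): linear terms plus one interaction.
    return 0.5 * v[0] + 0.2 * v[1] + 0.1 * v[2] * v[3]

def shapley_values(model, x, baseline):
    """Exact Shapley values of model at x, relative to a single baseline point.

    Factors outside a coalition are set to their baseline value.
    Cost grows as 2**n, so this is only feasible for a few factors.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

def global_importance(model, samples, baseline):
    """Mean absolute Shapley value per factor, normalized to percentages."""
    n = len(baseline)
    mean_abs = [0.0] * n
    for x in samples:
        for i, v in enumerate(shapley_values(model, x, baseline)):
            mean_abs[i] += abs(v) / len(samples)
    total = sum(mean_abs)
    return [100.0 * m / total for m in mean_abs]

# Made-up samples and baseline, for illustration only.
baseline = [0.0, 0.0, 0.0, 0.0]
samples = [[2.0, 1.0, 1.0, 1.0], [1.0, 3.0, 0.5, 2.0], [0.5, 0.5, 2.0, 1.0]]
importance = global_importance(toy_model, samples, baseline)
for name, pct in zip(FACTORS, importance):
    print(f"{name}: {pct:.2f}%")
```

For each sample, the Shapley values sum to the difference between the model output at that sample and at the baseline. This is the property that lets a single forecast be decomposed factor by factor, as in the hit, miss, and false alarm case analyses. In practice, tree models such as XGBoost are usually explained with a dedicated tree explainer rather than by exact enumeration.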

Keywords: SHAP values; machine learning; rainstorm; factor importance; interpretability

Classification: P457.6 [Astronomy and Earth Sciences: Atmospheric Science and Meteorology]

 
