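Authors: Gao Letian (高乐天); Gu Wenbo (顾文波) (College of Electrical Engineering, Xinjiang University, Urumqi 830017, China)
Source: Acta Energiae Solaris Sinica (《太阳能学报》), 2025, No. 4, pp. 256-262 (7 pages)
Funding: Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C87); Central Government Guidance Fund for Local Science and Technology Development (ZYYD2022C16).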
Abstract: Features such as photovoltaic module lifespan and cleanliness change over time but are absent from the dataset, and they can degrade photovoltaic power prediction. To limit this effect, this paper proposes a data mining method for small-sample photovoltaic power prediction based on random forest importance ranking and polynomial feature expansion. First, the features are ranked by random forest importance. Next, cross-validation is used to determine, for each regression model, the number of features to retain and the degree of the polynomial expansion. Finally, the prediction results on the cross-validation and test sets are compared before and after data mining. The results indicate that the proposed method is suitable for, and improves the prediction accuracy of, the multilayer perceptron regressor (MLPR) model and three MLPR-based time series regression models, namely the recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory (LSTM) network, under small-sample conditions.
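Keywords: data mining; photovoltaic power generation; forecasting; small sample; random forest importance ranking; polynomial feature expansion; cross-validation
Classification code: TK513.5 [Power Engineering and Engineering Thermophysics: Thermal Engineering]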
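
The three steps named in the abstract (importance ranking, feature retention, polynomial expansion chosen by cross-validation) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the importance scores are taken as given (the paper obtains them from a random forest), the function names `top_k_features`, `polynomial_expand`, `kfold_indices`, and `grid_search` are hypothetical, the folds are contiguous and unshuffled, and `fit_and_score` is a placeholder callback that stands in for training a regressor such as MLPR and returning a validation error.

```python
from itertools import combinations_with_replacement
from math import prod


def top_k_features(rows, importances, k):
    """Keep the k columns with the highest importance, most important first."""
    order = sorted(range(len(importances)),
                   key=lambda j: importances[j], reverse=True)[:k]
    return [[row[j] for j in order] for row in rows], order


def polynomial_expand(rows, degree):
    """Return all monomials of total degree 1..degree (no constant term)."""
    n_features = len(rows[0])
    combos = [c for d in range(1, degree + 1)
              for c in combinations_with_replacement(range(n_features), d)]
    return [[prod(row[j] for j in c) for c in combos] for row in rows]


def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) for contiguous, unshuffled folds (assumption)."""
    start = 0
    for i in range(n_folds):
        size = n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        val = list(range(start, start + size))
        train = [j for j in range(n_samples) if j < start or j >= start + size]
        yield train, val
        start += size


def grid_search(rows, targets, importances, ks, degrees, fit_and_score, n_folds=3):
    """Pick (k, degree) with the lowest mean validation error.

    fit_and_score(train_X, train_y, val_X, val_y) is a hypothetical callback
    that trains a model and returns its error on the validation fold.
    """
    best = None
    for k in ks:
        selected, _ = top_k_features(rows, importances, k)
        for degree in degrees:
            X = polynomial_expand(selected, degree)
            errors = [
                fit_and_score([X[i] for i in tr], [targets[i] for i in tr],
                              [X[i] for i in va], [targets[i] for i in va])
                for tr, va in kfold_indices(len(X), n_folds)
            ]
            mean_error = sum(errors) / len(errors)
            if best is None or mean_error < best[0]:
                best = (mean_error, k, degree)
    return best


# Usage: keep the 2 most important of 3 features, then expand to degree 2.
selected, order = top_k_features([[2.0, 3.0, 5.0]], [0.1, 0.6, 0.3], k=2)
print(order)                            # [1, 2]
print(polynomial_expand(selected, 2))   # [[3.0, 5.0, 9.0, 15.0, 25.0]]
```

With two retained features a and b, the degree-2 expansion yields a, b, a², ab, b², which is why the feature count and the degree are tuned together: each added feature or degree enlarges the input quickly, a real concern when the sample is small.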