A GAN-Based Data Augmentation Method for Small-Sample Corrosion Thickness Loss Rate Data

Corrosion Thickness Loss Rate Data Enhancement Based on a Small Sample of GAN


Authors: ZHOU Jun-yan; WANG Jing-cheng; YANG Xiao-kui [1]; SHU Chang [1]; WANG Jin-mei [1]; ZHANG Chen (Southwest Institute of Technology and Engineering, Chongqing 400039, China)

Affiliation: [1] Southwest Institute of Technology and Engineering, Chongqing 400039, China

Source: Equipment Environmental Engineering (《装备环境工程》), 2023, No. 1, pp. 142-150 (9 pages)

Abstract: Objective: To augment small-sample corrosion thickness loss rate data and thereby expand the dataset, so as to improve the prediction accuracy of downstream analysis models, reduce overfitting, and improve generalization. Methods: A generative adversarial network (GAN) was used to expand the corrosion thickness loss rate data and make the data distribution more complete. The generated data were visualized after dimensionality reduction to examine how the generated and original samples are distributed and to assess whether the augmentation is reasonable. Prediction and generalization ability were then evaluated across multiple algorithm models and multiple evaluation metrics. Results: The generated data filled sparse regions of the original data in sample space. After the generated data were added, the mean MSE of each machine learning model was 61.72% to 91.74% of its value without generated data, and the mean Pearson coefficient was 99.01% to 113.64% of its value without generated data. Prediction accuracy improved, the results were more strongly correlated, and model generalization improved. Conclusion: GAN can effectively augment small-sample corrosion thickness loss rate data, and the expanded data support analysis and prediction. The amount of generated data should not exceed the amount of original data, so that the distribution of the training samples is not disturbed. The diversity of the generated data is also limited.
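
The results above express each metric obtained with generated data as a percentage of the same metric obtained without it. A minimal Python sketch of that comparison follows. The function names and the toy numbers are illustrative assumptions, not taken from the paper, and the paper's models and data are not reproduced here.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def percent_of_baseline(with_generated, without_generated):
    """Metric with generated data as a percentage of the baseline metric."""
    return 100.0 * with_generated / without_generated


# Hypothetical values for illustration only (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
pred_without = [1.5, 2.5, 2.5, 4.5]   # model trained on original data only
pred_with = [1.0, 2.5, 3.0, 4.5]      # model trained with generated data added

mse_ratio = percent_of_baseline(mse(y_true, pred_with), mse(y_true, pred_without))
pearson_ratio = percent_of_baseline(
    pearson(y_true, pred_with), pearson(y_true, pred_without)
)
print(f"MSE: {mse_ratio:.2f}% of baseline")
print(f"Pearson: {pearson_ratio:.2f}% of baseline")
```

Under this convention, an MSE ratio below 100% means lower error with generated data, while a Pearson ratio above 100% means stronger correlation.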

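Keywords: corrosion thickness loss rate; small sample; generative adversarial network; data augmentation; dimensionality reduction analysis; sample distribution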

Classification number: TP399 [Automation and Computer Technology: Computer Application Technology]

 
