Novel Ensemble Modeling Method for Enhancing Subset Diversity Using Clustering Indicator Vector Based on Stacked Autoencoder  Cited by: 1


Authors: Yanzhen Wang, Xuefeng Yan

Affiliation: [1] Key Laboratory of Advanced Control and Optimization for Chemical Processes of Ministry of Education, East China University of Science and Technology, Shanghai, 200237, China

Source: Computer Modeling in Engineering & Sciences (English), 2019, Issue 10, pp. 123-144 (22 pages)

Funding: The authors are grateful for the support of the National Natural Science Foundation of China (21878081); the Fundamental Research Funds for the Central Universities of China under Grant 222201717006; and the Program of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017.

Abstract: A single model often cannot meet high-precision prediction requirements when the relationships between variables are highly nonlinear. Ensemble models can address this problem. Three key factors determine the accuracy of an ensemble model: the accuracy of each submodel, the diversity between subsample sets, and the choice of ensemble method. This study presents an improved ensemble modeling method that raises the prediction precision and generalization capability of the model. The proposed method first uses a bagging algorithm to generate multiple subsample sets. Second, an indicator vector is defined to describe each subsample set. Third, subsample sets are selected on the basis of agglomerative nesting (AGNES) clustering of the indicator vectors to maximize the diversity between subsets. The selected subsample sets are then used to train a stacked autoencoder. Finally, the XGBoost algorithm, rather than the traditional simple-average method, is used to combine the submodels. Experiments on three public machine learning datasets and an atmospheric column dry point dataset from a practical industrial process show that the proposed method achieves high precision and improved prediction ability.
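
The subset-selection steps in the abstract (bagging, indicator vectors, agglomerative clustering) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define the indicator vector, so the code assumes a binary vector marking which samples appear in each bootstrap subset, uses Hamming distance with average linkage, and keeps the first member of each cluster as its representative. All function names are hypothetical.

```python
import random

def bootstrap_indices(n_samples, n_subsets, rng):
    """Bagging step: draw n_subsets bootstrap samples (with replacement)."""
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_subsets)]

def indicator_vector(indices, n_samples):
    """Assumed encoding: 1 if a sample appears in the subset, else 0."""
    present = set(indices)
    return [1 if i in present else 0 for i in range(n_samples)]

def hamming(u, v):
    """Number of samples on which two indicator vectors disagree."""
    return sum(a != b for a, b in zip(u, v))

def agnes(vectors, n_clusters):
    """Bottom-up clustering: repeatedly merge the two closest clusters
    (average linkage on Hamming distance) until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(hamming(vectors[i], vectors[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters

def select_diverse_subsets(n_samples, n_candidates, n_selected, seed=0):
    """Generate candidate subsets, cluster their indicator vectors,
    and keep one subset per cluster (assumed rule: first member)."""
    rng = random.Random(seed)
    subsets = bootstrap_indices(n_samples, n_candidates, rng)
    vectors = [indicator_vector(s, n_samples) for s in subsets]
    clusters = agnes(vectors, n_selected)
    chosen = [c[0] for c in clusters]
    return [subsets[i] for i in chosen], chosen

if __name__ == "__main__":
    selected, chosen = select_diverse_subsets(30, 12, 4)
    print("selected candidate subsets:", chosen)
```

In the method described above, each selected subset would then train a stacked autoencoder submodel, and XGBoost would combine the submodel outputs; both stages are omitted here because the abstract gives no architectural or hyperparameter details.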

Keywords: ensemble model; deep learning; bagging; stacked autoencoder; XGBoost

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
