Two-stage Unsupervised Feature Selection Method Oriented to Manufacturing Procedural Data (cited by: 5)

Authors: ZHANG Jie [1,2]; SHENG Xia; ZHANG Peng; QIN Wei [1]; ZHAO Xinming [1] (Institute of Intelligent Manufacturing and Information Engineering, Shanghai Jiao Tong University, Shanghai 200240; College of Mechanical Engineering, Donghua University, Shanghai 201620)

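Affiliations: [1] School of Mechanical and Power Engineering, Shanghai Jiao Tong University, Shanghai 200240; [2] College of Mechanical Engineering, Donghua University, Shanghai 201620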

Source: Journal of Mechanical Engineering (《机械工程学报》), 2019, Issue 17, pp. 133-144 (12 pages)

Funding: National Natural Science Foundation of China (U1537110, 51435009)

Abstract: Modern manufacturing workshops generate large volumes of data continuously, most of which is stored as unlabeled, structured raw data on the industrial big data platforms of manufacturing enterprises. These data have considerable latent value, but their high noise and high redundancy make them difficult to analyze and use directly. To reduce the redundancy of manufacturing procedural data and to mine the local structure of the raw data, a two-stage unsupervised feature selection method is proposed. In the first stage, low-dimensional subsets of the original features generated by a genetic algorithm (GA) are used as the input of a radial basis function neural network (RBFNN), which reconstructs all dimensions of the original data. The dimensionality reduction ratio and the reconstruction accuracy together form the GA fitness function; over many GA iterations the method learns a low-dimensional representation of the high-dimensional features and removes redundant and noisy features from the original data set. In the second stage, the Laplacian score (LS) evaluates each remaining feature by how well it reflects the local geometric structure of the data, identifying the features that are more effective for improving classification performance. Comparison with LS and other unsupervised feature selection algorithms verifies that the proposed method effectively reduces the redundancy of manufacturing data and improves classification performance.
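The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact fitness formula, the RBFNN training procedure, or the GA operators, so everything below is an assumption made for illustration. In particular, the weighted-sum `fitness` with weight `alpha`, the randomly sampled RBF centers, the ridge-regularized output layer, the tournament selection, uniform crossover and elitism in `ga_select`, and all function and parameter names are hypothetical. The `laplacian_score` function follows the commonly used definition of the Laplacian score on a k-nearest-neighbor graph with heat-kernel weights, where a lower score means the feature better preserves local structure.

```python
import numpy as np

def rbf_reconstruct_error(X, mask, n_centers=10, gamma=1.0, ridge=1e-6, seed=0):
    """Fit an RBF network on the selected columns and reconstruct ALL columns.
    Returns the mean squared reconstruction error (lower is better)."""
    rng = np.random.default_rng(seed)
    Xs = X[:, mask]                                   # low-dimensional input
    idx = rng.choice(len(X), size=min(n_centers, len(X)), replace=False)
    centers = Xs[idx]                                 # assumption: centers sampled from the data
    d2 = ((Xs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    H = np.hstack([np.exp(-gamma * d2), np.ones((len(X), 1))])  # Gaussian units + bias
    # Assumption: linear output layer solved by ridge-regularized least squares.
    W = np.linalg.solve(H.T @ H + ridge * np.eye(H.shape[1]), H.T @ X)
    return float(np.mean((H @ W - X) ** 2))

def fitness(X, mask, alpha=0.5):
    """Hypothetical fitness: weighted sum of reduction ratio and reconstruction accuracy."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0                                    # an empty subset cannot reconstruct anything
    reduction = 1.0 - mask.sum() / mask.size
    accuracy = 1.0 / (1.0 + rbf_reconstruct_error(X, mask))
    return alpha * reduction + (1.0 - alpha) * accuracy

def ga_select(X, pop=20, gens=15, p_mut=0.05, seed=0):
    """Stage one: search over boolean feature masks with a small GA."""
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    P = rng.random((pop, m)) < 0.5                    # random initial population of masks
    for _ in range(gens):
        fit = np.array([fitness(X, ind) for ind in P])
        a, b = rng.integers(pop, size=(2, pop))       # binary tournament selection
        parents = np.where((fit[a] >= fit[b])[:, None], P[a], P[b])
        mates = parents[rng.permutation(pop)]
        cross = rng.random((pop, m)) < 0.5            # uniform crossover
        children = np.where(cross, parents, mates)
        children ^= rng.random((pop, m)) < p_mut      # bit-flip mutation
        children[0] = P[fit.argmax()]                 # elitism: keep the best mask
        P = children
    fit = np.array([fitness(X, ind) for ind in P])
    return P[fit.argmax()]

def laplacian_score(X, k=5, t=1.0):
    """Stage two: Laplacian score per feature (lower = better locality preservation)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    nn = d2.copy()
    np.fill_diagonal(nn, np.inf)                      # a sample is not its own neighbor
    order = np.argsort(nn, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), order.ravel()] = True
    A = A | A.T                                       # symmetric kNN graph
    S = np.where(A, np.exp(-d2 / t), 0.0)             # heat-kernel edge weights
    d = S.sum(axis=1)
    L = np.diag(d) - S                                # graph Laplacian
    Fc = X - (d @ X) / d.sum()                        # remove the degree-weighted mean
    num = (Fc * (L @ Fc)).sum(axis=0)
    den = (d[:, None] * Fc ** 2).sum(axis=0)
    return np.where(den > 1e-12, num / np.maximum(den, 1e-12), np.inf)
```

Under these assumptions the two stages chain together as `mask = ga_select(X)` followed by `scores = laplacian_score(X[:, mask])`, after which `np.argsort(scores)` ranks the surviving features from most to least locality-preserving. Features are assumed to be scaled to comparable ranges beforehand, since both the RBF kernel and the kNN graph depend on Euclidean distances.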

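Keywords: unsupervised feature selection; genetic algorithm; radial basis function neural network; Laplacian score; manufacturing procedural data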

Classification code: TG156 [Metallurgy and Metal Technology: Heat Treatment]
