Hypernetwork design for feature selection of high-dimensional small samples


Authors: WEI Junyi; DONG Hongbin [1]; YU Zikang (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China)

Affiliation: [1] College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), 2025, No. 2, pp. 465-474 (10 pages)

Funding: Natural Science Foundation of Heilongjiang Province (LH2020F023).

Abstract: Feature selection is a problem of broad interest across many industries. The datasets it targets are typically high-dimensional with few samples, as in biology and medicine. Although many regularized networks outperform more complex networks on such datasets, with so little data many latent feature relationships are still over-mined, which leads to overfitting. To address this, an end-to-end sparse reconstruction network is proposed. The model first applies sparsity-based enhancement and singular value embedding to the features, then trains the embedding matrix through a parallel auxiliary network to reconstruct the prediction weights, realizing a hypernetwork learning scheme with a reduced parameter count. Because a network with fewer parameters is less affected by overfitting, this effectively reduces the influence of ineffective parameters on the network. Experiments on 12 high-dimensional small-sample datasets from biology and medicine show that, compared with eight feature selection networks after dimensionality reduction, the proposed network improves classification accuracy by 3.26 percentage points on average. Ablation experiments separately verify the roles of the decomposition layer, the reconstruction layer, and the correlation layer. Finally, the learned weights are analyzed to further illustrate extended applications of the model.
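The pipeline named in the abstract (singular value embedding of features, then an auxiliary network that reconstructs the prediction weights) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract gives no layer definitions, so the truncated-SVD embedding, the one-hidden-layer tanh auxiliary network, and every name below (`feature_embeddings`, `init_aux`, `reconstruct_weights`, `predict_logits`, `n_params`) are assumptions made for illustration. The sparsity enhancement step and training are omitted.

```python
import numpy as np

def feature_embeddings(X, k):
    """Assumed embedding: each feature (column of X) becomes a k-dim
    vector taken from the truncated SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(k, s.shape[0])
    return Vt[:k].T * s[:k]  # shape (n_features, k)

def init_aux(k, hidden, n_classes, rng):
    """Parameters of a small auxiliary network (hypothetical layout)."""
    return {
        "W1": rng.normal(0.0, 0.1, (k, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }

def reconstruct_weights(E, params):
    """Map each feature embedding to one row of the prediction weights."""
    H = np.tanh(E @ params["W1"] + params["b1"])
    return H @ params["W2"] + params["b2"]  # shape (n_features, n_classes)

def predict_logits(X, E, params):
    """Classify with weights generated by the auxiliary network."""
    return X @ reconstruct_weights(E, params)

def n_params(params):
    """Count trainable scalars in the auxiliary network."""
    return sum(p.size for p in params.values())

# Synthetic high-dimensional small-sample data: 20 samples, 500 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))
E = feature_embeddings(X, k=8)
params = init_aux(k=8, hidden=16, n_classes=3, rng=rng)
W = reconstruct_weights(E, params)

print("generated weight matrix:", W.shape)
print("auxiliary parameters:", n_params(params), "vs direct:", W.size)
```

In this toy setup the trainable parameters live only in the auxiliary network, whose size depends on the embedding width rather than on the number of features. That is the sense in which a weight-generating (hypernetwork) layout can reduce the parameter count when features far outnumber samples.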

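Keywords: feature selection; regularization network; overfitting; end-to-end; sparse reconstruction; singular value; auxiliary network; hypernetwork; high-dimensional small sample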

CLC number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
