一种利用Screening加速技巧的Lasso算法  

Lasso algorithm using Screening accelerated technique

在线阅读下载全文

作  者:邱俊洋 潘志松[1] 易磊[1] 陶蔚[1] 张梁梁 

机构地区:[1]中国人民解放军理工大学指挥信息系统学院,南京210007

出  处:《计算机工程与应用》2018年第4期135-140,共6页Computer Engineering and Applications

基  金:国家自然科学基金(No.61273296);江苏省自然科学基金青年基金面上项目(No.BK20140075)

摘  要:Lasso(Least absolute shrinkage and selection operator)是目前广为应用的一种稀疏特征选择算法。经典的Lasso算法通过对高维数据进行特征选择一定程度上降低了计算开销,然而,求解Lasso问题目前仍面临诸多困难与挑战,例如当特征维数和样本数量非常大时,甚至无法将数据矩阵加载到主存储器中。为了应对这一挑战,Screening加速技巧成为近年来研究的热点。Screening可以在问题优化求解之前将稀疏优化结果中系数必然为0的无效特征筛选出来并剔除,从而极大地降低数据维度,在不损失问题求解精度的前提下,加速稀疏优化问题的求解速度。首先推导了Lasso的对偶问题,根据对偶问题的特性得出基于对偶多面投影的Screening加速技巧,最后将Screening加速技巧引入Lasso特征选择算法,并在多个高维数据集上进行实验,通过加速比、识别率以及算法运行时间三个指标验证了Screening加速技巧在Lasso算法上的良好性能。Recently, Lasso is one of the most popular algorithms for sparse feature selection in the application. Traditional Lasso algorithm reduces the computational overhead to a certain extent through feature selection. However, sometimes it may become impossible to load the data matrix into the main memory of the computer when the feature dimension or the number of samples is extremely large. Thus it can be seen that there are still many difficulties and challenges in solving Lasso algorithm. To address these challenges, the Screening accelerating technique has been proposed and becomes the research focus in recent years. The Screening technique is able to quickly identify and remove the invalid features before the optimization process. Then it only needs to deal with the reduced data, which can greatly accelerate the solving efficiency.The most important is that the Screening technique can guarantee that the discarded features have zero coefficients in the solution vector so as to reduce the data dimension as well as maintain the solution accuracy. This paper firstly derives the dual problem of Lasso, then introduces the Screening accelerated technique according to the properties of the dual problem. Besides, it evaluates the performance of the algorithms under several high dimensional datasets using three metrics,speedup ratio, detection rate and running time. Results show that the Screening accelerated technique is effective in solving efficiency of Lasso.

关 键 词:Lasso算法 Screening加速技巧 稀疏特征选择 高维数据 

分 类 号:TP181[自动化与计算机技术—控制理论与控制工程]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象