Optimization of GPU-based Eigenface Algorithm (基于GPU的特征脸算法优化研究)


Authors: LI Fan (李繁), YAN Xing (严星) [2], ZHANG Xiao-yu (张晓宇)

Affiliations: [1] Network & Experimental Teaching Center, Xinjiang University of Finance and Economics, Urumqi 830012, China; [2] School of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, China

Source: Computer Science (《计算机科学》), 2021, No. 4, pp. 197-204 (8 pages)

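Funding: National Natural Science Foundation of China (41830101); Xinjiang Social Science Fund (17BTQ093); Xinjiang University of Finance and Economics Young Doctoral Fund (2015BS003).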

Abstract: The eigenface algorithm is one of the commonly used face recognition methods based on facial representation. When the amount of training data is large, both the training and the testing modules are very time-consuming. To address this, the CUDA parallel computing architecture is used to accelerate the eigenface algorithm on the GPU. Because the benefit of GPU parallel computing depends on the hardware specifications, the complexity and parallelizability of the algorithm itself, and the way the developer parallelizes the program on the GPU, the paper first proposes GPU acceleration for the training-phase steps of computing the mean, subtracting the mean (zero mean), and normalizing the eigenfaces, and for the testing-phase steps of projecting onto the eigenface space and computing Euclidean distances. It then applies different parallelization methods to the corresponding steps and evaluates their performance. Experimental results show that, with the amount of face training data ranging from 320 to 1920, every computation step is accelerated markedly. Compared with an Intel i7-5960X, a GTX 1060 graphics card achieves an average speedup of about 71.7 times in the training module and about 34.1 times in the testing module.

Keywords: face recognition; eigenface; GPU parallel computing; rotation operation; kernel function

CLC number: TP301 [Automation and Computer Technology: Computer System Architecture]
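
To make the steps named in the abstract concrete, the following is a minimal CPU reference sketch in plain Python. It is not the paper's CUDA implementation: the function names, the flattened-list image layout, and the toy data are all assumptions made for illustration, and the eigen-decomposition that produces the eigenfaces is omitted. Each loop over pixels or over training images is the kind of independent work that a GPU kernel could distribute across threads.

```python
import math

# Hypothetical helper names; images are flat lists of pixel values.

def mean_face(faces):
    """Average each pixel position over all training images."""
    n = len(faces)
    return [sum(column) / n for column in zip(*faces)]

def zero_mean(faces, mean):
    """Subtract the mean face from every training image."""
    return [[p - m for p, m in zip(face, mean)] for face in faces]

def normalize(vec):
    """Scale a vector (for example an eigenface) to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def project(face, mean, eigenfaces):
    """Project a mean-subtracted face onto each eigenface (dot products)."""
    centered = [p - m for p, m in zip(face, mean)]
    return [sum(c * e for c, e in zip(centered, ef)) for ef in eigenfaces]

def euclidean_distance(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(weights, train_weights):
    """Index of the training weight vector closest to `weights`."""
    dists = [euclidean_distance(weights, w) for w in train_weights]
    return min(range(len(dists)), key=dists.__getitem__)

# Toy data (assumed): two 4-pixel "images" and one hand-picked eigenface.
faces = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
mean = mean_face(faces)
centered = zero_mean(faces, mean)
eigenfaces = [normalize(centered[0])]  # stand-in for a real eigenvector
train_weights = [project(f, mean, eigenfaces) for f in faces]

probe = [1.0, 2.0, 3.0, 3.0]
match = nearest(project(probe, mean, eigenfaces), train_weights)
print("closest training image:", match)
```

In this sketch the per-pixel work in `mean_face` and `zero_mean` and the per-image distances in `nearest` have no dependencies between iterations, which is why such steps are natural candidates for parallel execution.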

 
