Parallel design and implementation of scale invariant feature transform algorithm based on OpenCL  Cited by: 3

Authors: XU Chuanpei (许川佩)[1,2], WANG Guang (王光)[1,2]

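Affiliations: [1] School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; [2] Guangxi Key Laboratory of Automatic Detecting Technology and Instruments (Guilin University of Electronic Technology), Guilin, Guangxi 541004, China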

Source: Journal of Computer Applications (《计算机应用》), 2016, No. 7, pp. 1801-1806 (6 pages)

Abstract: To address the poor real-time performance of the Scale Invariant Feature Transform (SIFT) algorithm, a parallel optimized SIFT algorithm based on the Open Computing Language (OpenCL) is proposed. First, the original algorithm is restructured for parallel execution by splitting and recombining its steps and by reorganizing the data index of feature points in memory, so that intermediate results are exchanged entirely within device (video) memory. Then, each step is designed in parallel using strategies such as reusing global memory objects, sharing local memory, and optimizing memory reads, which improves data-read efficiency and reduces transfer latency. Finally, fine-grained parallel acceleration of SIFT is implemented in OpenCL on a Graphics Processing Unit (GPU) and ported to a Central Processing Unit (CPU). With registration results close to those of the original SIFT algorithm, the parallel algorithm speeds up feature extraction by factors of 10.51 to 19.33 on the GPU platform and 2.34 to 4.74 on the CPU platform. The experimental results show that SIFT accelerated with OpenCL can effectively improve the real-time performance of image registration, and that it overcomes a drawback of Compute Unified Device Architecture (CUDA): because CUDA is difficult to port, it cannot make full use of the multiple kinds of computing cores in a heterogeneous system.
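
The memory-object reuse strategy named in the abstract can be illustrated with a small host-side analogy. The sketch below is not the authors' OpenCL implementation: it is a pure-Python, 1-D stand-in for the difference-of-Gaussians stage of SIFT, and the function names (`gaussian_kernel`, `blur_into`, `dog_levels`), the kernel radius of three standard deviations, and the clamped border handling are all assumptions made for illustration. It only shows the idea of allocating two scratch buffers once and swapping their roles between scales instead of allocating a new buffer per scale.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian taps; the radius of ceil(3 * sigma) is an assumed choice."""
    radius = max(1, math.ceil(3 * sigma))
    taps = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def blur_into(src, dst, taps):
    """Convolve src with taps and write the result into the preallocated dst.

    Out-of-range samples are clamped to the nearest border sample.
    """
    radius = len(taps) // 2
    n = len(src)
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(taps):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * src[j]
        dst[i] = acc


def dog_levels(signal, sigmas):
    """Return difference-of-Gaussians levels using two reusable scratch buffers."""
    n = len(signal)
    prev, cur = [0.0] * n, [0.0] * n  # allocated once, reused for every scale
    blur_into(signal, prev, gaussian_kernel(sigmas[0]))
    levels = []
    for sigma in sigmas[1:]:
        blur_into(signal, cur, gaussian_kernel(sigma))
        # Only the DoG output itself is newly allocated.
        levels.append([c - p for c, p in zip(cur, prev)])
        prev, cur = cur, prev  # swap roles instead of allocating a new buffer
    return levels


if __name__ == "__main__":
    impulse = [0.0] * 41
    impulse[20] = 1.0
    for level in dog_levels(impulse, [1.0, 1.6, 2.56]):
        print(round(level[20], 4))
```

In an OpenCL setting the analogous step would presumably be to keep a small set of device buffers alive and rebind them as kernel arguments between scales, so that intermediate images never travel back to the host; the abstract does not give that level of detail, so this remains a reading of it rather than a description of the paper's code.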

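Keywords: Scale Invariant Feature Transform (SIFT) algorithm; Open Computing Language (OpenCL); memory object reuse; fine-grained parallelism; heterogeneous system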

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
