Optimizing Yinyang K-means algorithm on many-core CPUs  Cited by: 1


Authors: ZHOU Tianyang; WANG Qinglin[1,2]; LI Rongchun; MEI Songzhu[1,2]; YIN Shangfei; HAO Ruochen[1,2]; LIU Jie

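Affiliations: [1] College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China [2] National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha 410073, Hunan, China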

Source: Journal of National University of Defense Technology, 2024, No. 1, pp. 93-102 (10 pages)

Funding: National Natural Science Foundation of China (62002365).

Abstract: The traditional Yinyang K-means algorithm is computationally expensive on large-scale clustering problems. Targeting the architectural characteristics of typical many-core CPUs, this work proposes an efficient parallel implementation of Yinyang K-means. The implementation builds on a new in-memory data layout, uses the vector units of many-core CPUs to accelerate the distance calculations in Yinyang K-means, and applies memory-access optimizations targeted at non-uniform memory access (NUMA) characteristics. Compared with the open-source multi-threaded implementation of Yinyang K-means, it achieves speedups of up to about 5.6 and 8.7 on ARMv8 and x86 many-core platforms, respectively. The experiments show that these optimizations successfully accelerate the Yinyang K-means algorithm on many-core CPUs.
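The abstract does not describe the new memory layout in detail, so the following is only a minimal sketch of one common vectorization-friendly choice, not the authors' implementation. It assumes a dimension-major (transposed) centroid buffer, so that the distances from one point to all centroids are accumulated over contiguous values. The function names `to_dimension_major`, `squared_distances`, and `nearest_centroid` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: the layout used in the paper is not given in the
# abstract. Assumption: centroids are stored dimension-major in a flat buffer.

def to_dimension_major(centroids):
    """Transpose k centroids of d dimensions into a flat buffer in which
    coordinate j of centroid c is stored at index j * k + c."""
    k = len(centroids)
    d = len(centroids[0])
    flat = [0.0] * (d * k)
    for c, centroid in enumerate(centroids):
        for j, value in enumerate(centroid):
            flat[j * k + c] = value
    return flat, k, d


def squared_distances(point, flat, k, d):
    """Squared Euclidean distance from one point to all k centroids.

    The inner loop reads k contiguous values; in a compiled language this is
    the access pattern a vector unit can process several lanes at a time."""
    acc = [0.0] * k
    for j in range(d):
        x = point[j]
        base = j * k
        for c in range(k):
            diff = x - flat[base + c]
            acc[c] += diff * diff
    return acc


def nearest_centroid(point, flat, k, d):
    """Index of the centroid with the smallest squared distance."""
    dists = squared_distances(point, flat, k, d)
    return min(range(k), key=dists.__getitem__)
```

In pure Python the loops run one element at a time, so the sketch only shows the indexing scheme. The performance benefit would come from a compiled kernel, using SIMD intrinsics or auto-vectorization, over the same layout.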

Keywords: K-means; non-uniform memory access (NUMA); vectorization; many-core processors; performance optimization

Classification: TP311.1 [Automation and Computer Technology: Computer Software and Theory]

 
