A Parallel Approach to Discords Discovery in Massive Time Series Data  


Authors: Mikhail Zymbler, Alexander Grents, Yana Kraeva, Sachin Kumar

Affiliation: [1] Department of Computer Science, South Ural State University, Chelyabinsk, 454080, Russia

Source: Computers, Materials & Continua (English-language journal), 2021, No. 2, pp. 1867-1878 (12 pages)

Funding: the Russian Foundation for Basic Research (Grant No. 20-07-00140); the Ministry of Science and Higher Education of the Russian Federation (Government Order FENU-2020-0022).

Abstract: A discord is a refinement of the concept of an anomalous subsequence of a time series. As one of the topical issues of time series mining, discords discovery is applied in a wide range of real-world areas (medicine, astronomy, economics, climate modeling, predictive maintenance, energy consumption, etc.). In this article, we propose a novel parallel algorithm for discords discovery on a high-performance cluster with nodes based on many-core accelerators, for the case when the time series cannot fit in main memory. We assume that the time series is partitioned across the cluster nodes, and we parallelize the computation both among the cluster nodes and within a single node. Within a cluster node, the algorithm employs a set of matrix data structures to store and index the subsequences of the time series and to provide efficient vectorization of computations on the accelerator. At each node, the algorithm processes its own partition in two phases, namely candidate selection and discord refinement, with each phase requiring one linear scan through the partition. The local discords found are then combined into the global candidate set and transmitted to each cluster node. Next, each node refines the global candidate set over its own partition, resulting in the local true discord set. Finally, the global true discord set is constructed as the intersection of the local true discord sets. The experimental evaluation on a real computer cluster with real and synthetic time series shows high scalability of the proposed algorithm.
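
The flow described in the abstract (per-node candidate selection and refinement, merging into a global candidate set, a second refinement on every node, and a final intersection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it runs the "nodes" sequentially in one process and omits the matrix data structures and accelerator vectorization. It also rests on assumptions the abstract does not state: a discord is taken to be a subsequence whose Euclidean distance to every non-overlapping subsequence is at least a threshold `r`, partitions overlap by `m - 1` points so that each length-`m` subsequence belongs to exactly one node, and all function and parameter names are hypothetical.

```python
import numpy as np

def subsequences(part, offset, m):
    """Yield (global_start, values) for every length-m window of a partition."""
    for i in range(len(part) - m + 1):
        yield offset + i, part[i:i + m]

def select(part, offset, m, r):
    """Phase 1 (assumed form): one scan keeping windows with no close match so far."""
    cands = {}
    for i, s in subsequences(part, offset, m):
        is_cand = True
        for j, c in list(cands.items()):
            # Overlapping windows (|i - j| < m) are self-matches and are skipped.
            if abs(i - j) >= m and np.linalg.norm(s - c) < r:
                del cands[j]      # the stored candidate has a close neighbor
                is_cand = False   # and so does the current window
        if is_cand:
            cands[i] = s.copy()
    return cands

def refine(cands, part, offset, m, r):
    """Phase 2 (assumed form): one scan dropping candidates with a match closer than r."""
    alive = dict(cands)
    for i, s in subsequences(part, offset, m):
        for j, c in list(alive.items()):
            if abs(i - j) >= m and np.linalg.norm(s - c) < r:
                del alive[j]
    return alive

def global_discords(series, m, r, nodes):
    """Simulate the cluster: local discords -> global candidates -> intersection."""
    n_subseq = len(series) - m + 1
    bounds = [int(b) for b in np.linspace(0, n_subseq, nodes + 1)]
    # Each partition carries m - 1 extra points so its last window is complete.
    parts = [(series[a:b + m - 1], a) for a, b in zip(bounds[:-1], bounds[1:])]
    # Step 1: every node finds the discords of its own partition.
    local = [refine(select(p, off, m, r), p, off, m, r) for p, off in parts]
    # Step 2: local discords are merged into the global candidate set.
    merged = {}
    for found in local:
        merged.update(found)
    # Step 3: every node refines the global candidates over its own partition.
    survivors = [set(refine(merged, p, off, m, r)) for p, off in parts]
    # Step 4: a global discord must survive refinement on every node.
    return sorted(set.intersection(*survivors))

if __name__ == "__main__":
    # Synthetic example: a sine wave with a bump injected at points 200..209.
    series = np.sin(2 * np.pi * np.arange(400) / 20)
    series[200:210] += 3.0
    print(global_discords(series, m=20, r=2.0, nodes=4))
```

In this sketch the result does not depend on the number of simulated nodes: a subsequence with a close non-overlapping match somewhere in the series is eliminated by the node that holds that match, so it cannot appear in the intersection.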

Keywords: time series; discords discovery; computer cluster; many-core accelerator; vectorization

Classification code: O41 [Science: Theoretical Physics]

 
