Mining Spatial High Utility Core Patterns under k-Nearest Neighbors  Cited by: 2


Authors: LUO Jin, WANG Li-Zhen[1], WANG Xiao-Xuan, XIAO Qing[1] (School of Information Science and Engineering, Yunnan University, Kunming 650504)

Affiliation: [1] School of Information Science and Engineering, Yunnan University, Kunming 650504

Source: Chinese Journal of Computers (《计算机学报》), 2022, No. 2, pp. 354-368 (15 pages)

Funding: Supported by the National Natural Science Foundation of China (61966036, 61662086) and the Yunnan Province Innovation Team Project (2018HC019).

Abstract: Spatial data mining aims to discover and extract valuable latent knowledge from spatial databases. Spatial co-location pattern mining has long been one of the important research directions in this field; it seeks subsets of spatial features whose instances frequently appear close together, while spatial high utility co-location pattern mining additionally takes the utility attributes of the features into account. When measuring the neighbor relationship between spatial instances, both methods usually require a user-specified distance threshold d, and the efficiency and effectiveness of the mining algorithms are strongly affected by d; in particular, such algorithms perform poorly on unevenly distributed datasets. In addition, when traditional spatial high utility co-location pattern mining evaluates the utility of a pattern, it counts the utility values of all features in the pattern, which is unreasonable because users are not interested in the utility of some of those features. For example, when planning high-yield commercial projects around 5A-level scenic spots in China, the expected income of a project should not include the income of the scenic spots themselves. The high utility patterns obtained by the traditional method are therefore not necessarily reliable. To address these problems, this paper introduces the k-nearest neighbor relationship into spatial high utility co-location pattern mining, which removes the need to set a distance threshold and makes the neighbor relationship between spatial instances more objective and reasonable. Further, the concepts of core elements and core patterns are defined. To measure the utility of a core pattern, the core participation instance set of a feature in a core pattern, the core utility participation ratio of a feature in a core pattern, and the core utility index of a core pattern are formally defined. A general framework for mining spatial high utility core patterns under the k-nearest neighbor relationship is proposed, together with an effective basic mining algorithm. Because the core pattern utility measure does not satisfy the anti-monotone property, four pruning strategies are proposed on top of the basic algorithm. Extensive experiments show that the spatial high utility core patterns mined by the proposed method are more meaningful in practice and that, under the same parameter settings, the pruning-optimized algorithm is at least 50% more efficient than the basic algorithm.
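
The contrast drawn in the abstract between a fixed distance threshold d and a k-nearest neighbor relationship can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the sample data, the use of Euclidean distance, and the choice to count only instances of other features as neighbor candidates are all assumptions made for illustration.

```python
import math

# Each instance is (instance id, feature, (x, y)). Hypothetical sample data:
# a dense cluster near the origin and a sparse cluster far away.
INSTANCES = [
    ("A1", "A", (0.0, 0.0)),
    ("B1", "B", (0.1, 0.0)),
    ("C1", "C", (0.0, 0.1)),
    ("A2", "A", (10.0, 10.0)),
    ("B2", "B", (13.0, 10.0)),
    ("C2", "C", (10.0, 14.0)),
]


def threshold_neighbors(instances, d):
    """Neighbors of each instance under a fixed distance threshold d.

    Assumption: only instances of a different feature count as neighbors.
    """
    result = {iid: set() for iid, _, _ in instances}
    for iid, feat, loc in instances:
        for jid, jfeat, jloc in instances:
            if jfeat != feat and math.dist(loc, jloc) <= d:
                result[iid].add(jid)
    return result


def knn_neighbors(instances, k):
    """Neighbors of each instance under a k-nearest neighbor relation.

    Assumption: each instance takes its k closest instances of other
    features; ties are broken by instance id. The relation is directed
    and need not be symmetric.
    """
    result = {}
    for iid, feat, loc in instances:
        candidates = sorted(
            (math.dist(loc, jloc), jid)
            for jid, jfeat, jloc in instances
            if jfeat != feat
        )
        result[iid] = {jid for _, jid in candidates[:k]}
    return result


if __name__ == "__main__":
    print(threshold_neighbors(INSTANCES, 1.0))
    print(knn_neighbors(INSTANCES, 2))
```

With a threshold chosen to suit the dense cluster (d = 1.0 here), the instances in the sparse cluster end up with no neighbors at all, whereas the k-nearest neighbor relation (k = 2) adapts to local density and still links A2 to B2 and C2. This is one way to read the abstract's claim that fixed thresholds behave poorly on unevenly distributed data.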

Keywords: spatial data mining; spatial co-location pattern; spatial high utility core pattern; k-nearest neighbors

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
