Authors: Fang Yuan (方圆), Wang Lizhen (王丽珍)[3], Wang Xiaoxuan (王晓璇)[3], Yang Peizhong (杨培忠)
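Affiliations: [1] School of Mathematics and Statistics, Yunnan University, Kunming 650500; [2] South-Western Institute for Astronomy Research, Yunnan University, Kunming 650500; [3] School of Information Science and Engineering, Yunnan University, Kunming 650500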
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2022, Issue 2, pp. 264-281 (18 pages)
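Funding: National Natural Science Foundation of China (61966036, 61662086); Yunnan Province Innovation Team Fund Project (2018HC019); Yunnan University Postdoctoral Fund Project (C176220200).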
Abstract: Traditional spatial co-location pattern mining aims to discover subsets of spatial features whose instances are frequently located together in geographic neighborhoods. Most previous studies take the prevalence of a pattern as the interestingness measure. In practice, however, users are often interested not only in the prevalence of a feature set but also in its completeness, namely the portion of co-location instances that a pattern occupies in their neighborhoods. Combining prevalence and completeness, this paper proposes dominant spatial co-location patterns (DSCPs) to provide users with a set of higher-quality co-location patterns. Specifically, an occupancy measure is introduced into the spatial co-location pattern mining task to measure the completeness of co-location patterns, and the problem of DSCP mining is formulated by considering both completeness and prevalence. A basic algorithm for discovering DSCPs is presented; to reduce its high computational cost, a series of pruning strategies and a novel data structure are proposed to improve mining efficiency. Experiments on synthetic and real-world data sets evaluate the efficiency and effectiveness of the proposed algorithms and verify that the pruning strategies substantially improve efficiency. The mining results in two real-world applications demonstrate that DSCPs are reasonable and usable.
Keywords: spatial data mining; dominant co-location pattern; occupancy measure; prevalence measure; spatial association rule
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]