Authors: Lü Ping (吕萍); Chang Yuhui (常玉慧); Qian Jin (钱进) (School of Computer Engineering, Jiangsu University of Technology, Changzhou 213001, China; School of Software, East China Jiaotong University, Nanchang 330013, China)
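Affiliations: [1] School of Computer Engineering, Jiangsu University of Technology, Changzhou 213001 [2] School of Software, East China Jiaotong University, Nanchang 330013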
Source: Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》), 2022, No. 4, pp. 594-603 (10 pages)
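Funding: National Natural Science Foundation of China (62066014); Jiangsu Province "Qinglan Project"; Jiangxi Province "Double Thousand Plan"; Natural Science Foundation of Jiangxi Province (20202BABL202018).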
Abstract: Knowledge reduction for massive datasets has attracted considerable research interest in rough set theory in recent years. Classical knowledge reduction algorithms assume that the whole dataset can be loaded at once into the main memory of a single machine, which is infeasible for large-scale data. In addition, the choice of attribute uncertainty measure leads to differences in the efficiency of parallel knowledge reduction algorithms. To address these issues, this paper systematically studies the differences and interrelationships among classical uncertainty measures from the perspective of knowledge granularity. Map and Reduce functions that combine data parallelism and task parallelism are then designed and implemented to compute, for different candidate attribute subsets, the induced equivalence classes and the uncertainty of each subset. Finally, a parallel algorithm model for knowledge reduction under the knowledge granularity framework is constructed via MapReduce; it can be used to compute a reduct for algorithms based on the relative discernibility relation, the relative indiscernibility relation, and the complementary condition entropy. Experiments on the Hadoop platform show that the proposed parallel knowledge reduction algorithms can process massive datasets effectively.
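The abstract's idea of evaluating several candidate attribute subsets in one pass can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the function names (`map_record`, `reduce_counts`, `knowledge_granularity`), the toy table, and the choice of measure are assumptions made here for illustration. The measure used is the commonly cited knowledge granularity GK(B) = Σ|X_i|² / |U|², where the X_i are the equivalence classes induced by attribute subset B on universe U. The record does not state the exact measures or the form of the Map and Reduce functions, so treat this only as one plausible reading.

```python
from collections import defaultdict
from fractions import Fraction


def map_record(record, candidates):
    """Map step (hypothetical): for every candidate attribute subset,
    emit ((subset, attribute values), 1). Emitting one pair per subset
    is what lets several candidates be evaluated in the same pass."""
    for subset in candidates:
        key = (subset, tuple(record[a] for a in subset))
        yield key, 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the counts per key, which gives
    the size of each equivalence class for each candidate subset."""
    sizes = defaultdict(int)
    for key, count in pairs:
        sizes[key] += count
    return dict(sizes)


def knowledge_granularity(sizes, n_objects):
    """Assumed measure: GK(B) = sum(|X_i|^2) / |U|^2 per subset B."""
    totals = defaultdict(int)
    for (subset, _values), size in sizes.items():
        totals[subset] += size * size
    return {s: Fraction(t, n_objects ** 2) for s, t in totals.items()}


# Toy decision table (made-up data, four objects, two attributes).
table = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 1},
    {"a": 1, "b": 1},
]
candidates = [("a",), ("a", "b")]

# In a real MapReduce job the map calls would run on separate data splits;
# here they are simply chained in one process.
pairs = [kv for row in table for kv in map_record(row, candidates)]
sizes = reduce_counts(pairs)
granularity = knowledge_granularity(sizes, len(table))
```

On this toy table, subset `("a",)` yields two classes of size 2, and adding `b` splits one of them, so the granularity drops from 1/2 to 3/8. A finer partition gives a smaller value, which is the kind of signal an attribute-significance step could use.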
Keywords: MapReduce; knowledge reduction; data parallelism; task parallelism; knowledge granularity
Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]