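Affiliations: [1] State Grid Shanxi Electric Power Company, Jincheng Power Supply Company, Jincheng, Shanxi 048000; [2] Beijing Guodiantong Network Technology Co., Ltd., Beijing 100070; [3] School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206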
Source: Computer Technology and Development, 2017, No. 6, pp. 76-80 (5 pages)
Funding: National Natural Science Foundation of China (51507063)
Abstract: Power grid dispatching and operation produce massive, highly complex multi-source heterogeneous data, and using data mining to turn these data into knowledge is an inevitable trend in the development of intelligent dispatching. To this end, an analysis model for multi-source heterogeneous data based on dispatching and control big data is constructed, and a frequent itemset mining algorithm capable of handling big data is proposed, which introduces distributed statistics into the frequent itemset mining process. Drawing on combinatorics, the algorithm uses MapReduce to complete the entire mining of frequent itemsets from the original transaction database in a single scan, and it does not need to rescan the database when the support threshold changes. This addresses the repeated scans of the transaction database and the low mining efficiency of existing frequent itemset mining algorithms. The proposed algorithm is compared experimentally with other frequent itemset mining algorithms by processing a fault information transaction database on the Hadoop platform. The results show that the proposed algorithm is unaffected by the support threshold, scales well, and has low time cost on massive transaction data. It can provide useful knowledge for intelligent dispatching analysis and decision making that jointly optimizes objectives such as accuracy, security, and economy, which a single data source could not achieve.
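The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of the general idea it describes: one pass over the transactions that counts every itemset by enumerating combinations, after which any support threshold can be applied to the stored counts without rescanning. The function names, the `max_len` cap, and the toy transactions are all assumptions made for illustration; the map and reduce steps are simulated in plain Python rather than run on Hadoop, and this is not claimed to be the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, max_len=None):
    """Map step (hypothetical): emit (itemset, 1) for each non-empty subset."""
    items = sorted(set(transaction))
    limit = len(items) if max_len is None else min(max_len, len(items))
    for k in range(1, limit + 1):
        for subset in combinations(items, k):
            yield subset, 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def count_itemsets(transactions, max_len=None):
    """Scan the transactions exactly once and return all itemset counts."""
    pairs = (pair for t in transactions for pair in map_transaction(t, max_len))
    return reduce_counts(pairs)


def frequent_itemsets(counts, n_transactions, min_support):
    """Filter stored counts by a relative support threshold; no rescan needed."""
    threshold = min_support * n_transactions
    return {s: c for s, c in counts.items() if c >= threshold}


# Toy data (made up for illustration, not from the paper).
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
counts = count_itemsets(transactions)

# Two different thresholds are answered from the same counts.
high = frequent_itemsets(counts, len(transactions), 0.75)
low = frequent_itemsets(counts, len(transactions), 0.5)
```

Enumerating every subset grows exponentially with transaction length, which is why the sketch exposes the assumed `max_len` cap; how the paper controls this cost is not stated in the abstract.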
Classification: TP39 [Automation and Computer Technology: Computer Application Technology]