Author: Zheng Jie (郑洁), Guiyang Vocational and Technical College, Guiyang 550081, China
Source: Wireless Internet Technology (《无线互联科技》), 2018, No. 20, pp. 105-106 (2 pages)
Abstract: With the rapid development of information technology, data processing has evolved from simple file handling to complex and powerful database systems, and data mining emerged to turn the stored data into useful knowledge. Low data quality in a data set markedly reduces the accuracy of mining results, and missing data is the most common form of low data quality, so repairing missing data is a topic that deserves attention. This paper deals mainly with missing values in continuous attributes. The repair method follows the idea of imputation: it uses the relationships between attributes to estimate the missing values from the values already present in the data set. The paper compares against a common and efficient method of missing-value repair and, building on it, introduces feature weights for the attributes so that important attributes have more influence on the repair calculation, which further improves the accuracy of the repaired data.
Classification number: TP311.13 [Automation and Computer Technology - Computer Software and Theory]
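The abstract describes the weighted repair only in words, so the following is a minimal sketch of what attribute-weighted imputation of a continuous value could look like, not the paper's actual algorithm. It assumes a k-nearest-neighbour baseline (the abstract does not name the method it compares against), uses `None` to mark a missing value, and takes the feature weights as an input rather than deriving them. The names `weighted_distance` and `weighted_knn_impute`, and the sample data, are hypothetical.

```python
import math


def weighted_distance(a, b, weights, skip):
    """Weighted Euclidean distance over attributes observed in both rows.

    `skip` is the index of the attribute being repaired; it is excluded.
    """
    num = 0.0
    den = 0.0
    for c, (x, y) in enumerate(zip(a, b)):
        if c == skip or x is None or y is None:
            continue
        num += weights[c] * (x - y) ** 2
        den += weights[c]
    if den == 0.0:
        return math.inf  # no weighted attribute is shared by the two rows
    return math.sqrt(num / den)


def weighted_knn_impute(rows, weights, k=2):
    """Return a copy of `rows` in which each None is replaced by the mean
    of that attribute over the k nearest rows that have it observed.

    Assumption: a kNN-style baseline; weights are supplied by the caller.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = weighted_distance(row, other, weights, skip=j)
                if d < math.inf:
                    donors.append((d, other[j]))
            donors.sort(key=lambda t: t[0])  # nearest donors first
            nearest = donors[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical data: the last row is missing its third attribute.
rows = [
    [1.0, 10.0, 100.0],
    [1.1, 50.0, 110.0],
    [5.0, 11.0, 500.0],
    [1.05, 12.0, None],
]
uniform = weighted_knn_impute(rows, weights=[1.0, 1.0, 1.0], k=2)
weighted = weighted_knn_impute(rows, weights=[1.0, 0.0, 1.0], k=2)
print(uniform[3][2], weighted[3][2])
```

In this toy data the first attribute tracks the missing one closely and the second does not. Giving the second attribute zero weight changes which rows count as nearest, which is the effect the abstract attributes to feature weights. How the weights are actually computed in the paper is not stated in the abstract.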