Authors: SUN Bao-gang (孙宝刚) [1]; HE Guo-bin (何国斌) [2]
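Affiliations: [1] School of Computer Engineering, Chongqing College of Humanities, Science & Technology, Chongqing 401524, China; [2] College of Computer & Information Science, Southwest University, Chongqing 400715, China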
Source: Computer Simulation (《计算机仿真》), 2025, No. 1, pp. 362-366 (5 pages)
Funding: Humanities and Social Sciences Research Project of the Chongqing Municipal Education Commission (22SKGH493).
Abstract: To keep noisy data from degrading data mining results and to improve the accuracy and quality of data mining, a data mining algorithm that integrates similarity and random forest is proposed. First, the singular value decomposition (SVD) algorithm is used to decompose the data matrix into a series of singular values, and the median absolute deviation (MAD) method is introduced to select the larger singular values; the data are then reconstructed from these singular values to obtain the denoised data. Next, the sample entropy of the denoised data is calculated and used as the data feature; the features are screened by combining the P-value with feature similarity, so that redundant features are eliminated and the optimal data features are selected. Finally, an extreme random forest is built and the data features are fed into it to complete the data mining. Experimental results show that the proposed algorithm achieves high recall, F-measure, and AUC values, indicating good data mining performance.
Keywords: data similarity; singular value decomposition algorithm; median absolute deviation method; extreme random forest; data mining
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]
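The denoising step named in the abstract (SVD followed by MAD-based selection of the larger singular values) can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so the threshold used here (median plus `k` times the MAD, with `k = 3.0`) is an assumption, as are the function names `mad_threshold` and `svd_denoise`; this is not the authors' implementation.

```python
import numpy as np


def mad_threshold(s, k=3.0):
    """Assumed rule (not from the paper): median + k * MAD of the singular values."""
    med = np.median(s)
    mad = np.median(np.abs(s - med))
    return med + k * mad


def svd_denoise(X, k=3.0):
    """Reconstruct X from the singular values that exceed the MAD threshold."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > mad_threshold(s, k)
    if not keep.any():
        keep[0] = True  # fall back to the largest singular value
    # Scale the kept left singular vectors by their singular values, then project back.
    X_hat = (U[:, keep] * s[keep]) @ Vt[keep, :]
    return X_hat, int(keep.sum())


# Synthetic illustration: a rank-2 matrix corrupted by small Gaussian noise.
rng = np.random.default_rng(0)
clean = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 20))
noisy = clean + 0.01 * rng.standard_normal((50, 20))
denoised, rank = svd_denoise(noisy)
err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
```

On this synthetic data the two dominant singular values carry the signal, so comparing `err_denoised` with `err_noisy` gives a quick check that the truncation removes noise rather than signal. The sample-entropy, feature-screening, and extreme random forest stages described in the abstract are not covered by this sketch.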