Authors: SHEN Chen (沈忧), HE Yong (何勇) [1,2], PENG Anlang (彭安浪) [3]
Affiliations: [1] State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou, China; [2] College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China; [3] Guizhou Zhaoxin Digital Technology Co., Ltd., Guiyang 550025, Guizhou, China
Source: Computer Engineering (《计算机工程》), 2025, No. 4, pp. 107-118 (12 pages)
Funding: Guizhou Provincial Science and Technology Support Program (黔科合支撑[2022]一般267).
Abstract: In Internet of Things (IoT) scenarios, data are susceptible to noise during collection and transmission, which introduces outliers and missing values. Existing temporal regularized matrix factorization models typically use the squared loss to measure reconstruction error, overlooking the fact that, for multidimensional time series containing anomalous data, the quality of the matrix factorization is itself a key factor in prediction performance. This paper proposes TARNMF, a Time-Aware Robust Non-negative Matrix Factorization framework for multidimensional time series prediction based on the L_(2,log) norm. TARNMF captures the spatiotemporal correlations of multidimensional time series data through Non-negative Matrix Factorization (NMF) and an autoregressive (AR) temporal regularization term with learnable parameters. Under the assumption that data containing outliers follow a Laplace distribution, it uses the L_(2,log) norm to estimate the error between the original data and the reconstructed matrix in the robust non-negative factorization, reducing the interference of anomalous data with the prediction model. The L_(2,log) norm retains the properties of existing robust metric functions, addresses the approximation problem of the L_(1) loss, and reduces the influence of outliers on the objective function by compressing their residuals. The paper also proposes an optimization method based on projected gradient descent. Experimental results show that TARNMF has good scalability and robustness; in particular, on the high-dimensional Solar dataset it reduces the relative mean absolute error by 8.64% compared with the second-best result. Experiments on noisy data further verify that TARNMF can efficiently process and predict IoT time series data containing anomalous data.
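The pipeline described in the abstract (non-negative factorization, a log-compressed reconstruction loss, an AR temporal regularizer with learnable coefficients, and projected gradient descent) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give the exact definition of the L_(2,log) norm, so the loss below assumes the form sum over time steps of log(1 + ||y_t - W x_t||_2). The function names `objective`, `fit`, and `forecast_next`, the lag set, the fixed step size, and the weight `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def objective(Y, W, X, theta, lags, lam, eps=1e-8):
    """Assumed robust loss plus AR penalty (not the paper's exact formula)."""
    R = Y - W @ X                                    # residuals, one column per time step
    norms = np.sqrt((R ** 2).sum(axis=0) + eps)      # smoothed L2 norm of each column
    m, T = max(lags), X.shape[1]
    E = X[:, m:].copy()                              # AR errors on the latent series
    for k, lag in enumerate(lags):
        E -= theta[:, k:k + 1] * X[:, m - lag:T - lag]
    value = np.log1p(norms).sum() + 0.5 * lam * (E ** 2).sum()
    return value, R, norms, E


def fit(Y, rank=3, lags=(1, 2), lam=0.1, step=0.005, iters=500, seed=0):
    """Projected gradient descent with a fixed step (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n, T = Y.shape
    W = rng.random((n, rank))
    X = rng.random((rank, T))
    theta = np.zeros((rank, len(lags)))              # learnable AR coefficients
    m = max(lags)
    history = []
    for _ in range(iters):
        value, R, norms, E = objective(Y, W, X, theta, lags, lam)
        history.append(value)
        # Gradient of log(1 + ||r_t||) w.r.t. r_t: large residuals get small weights.
        G = R / (norms * (1.0 + norms))
        grad_W = -G @ X.T
        grad_X = -W.T @ G
        grad_X[:, m:] += lam * E
        grad_theta = np.zeros_like(theta)
        for k, lag in enumerate(lags):
            past = X[:, m - lag:T - lag]
            grad_X[:, m - lag:T - lag] -= lam * theta[:, k:k + 1] * E
            grad_theta[:, k] = -lam * (E * past).sum(axis=1)
        # Gradient step, then projection onto the non-negative orthant for W and X.
        W = np.maximum(W - step * grad_W, 0.0)
        X = np.maximum(X - step * grad_X, 0.0)
        theta = theta - step * grad_theta
    history.append(objective(Y, W, X, theta, lags, lam)[0])
    return W, X, theta, history


def forecast_next(W, X, theta, lags):
    """One-step-ahead forecast: roll the AR model forward, then map back with W."""
    x_next = sum(theta[:, k] * X[:, -lag] for k, lag in enumerate(lags))
    return W @ x_next
```

Calling `fit(Y)` on a non-negative matrix `Y` with rows as series and columns as time steps returns the factors, the AR coefficients, and the objective history; `forecast_next(W, X, theta, (1, 2))` then gives a one-step forecast for every series. The weight 1 / (norm * (1 + norm)) on each residual column is what keeps a single outlying time step from dominating the update, which is the effect the abstract attributes to compressing outlier residuals.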
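Keywords: L_(2,log) norm; non-negative matrix factorization; temporal regularized matrix factorization; multidimensional time series prediction; robustness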
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]