Data Completion of Air Quality Index Based on Multi-dimensional Sparse Representation


Authors: CAI Qiquan, LU Juhong, YU Zhiyong [1,2], HUANG Fangwan [1,2]

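Affiliations: [1] College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China; [2] Fujian Key Laboratory of Network Computing and Intelligent Information Processing (Fuzhou University), Fuzhou 350108, China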

Source: Computer Science (《计算机科学》), 2023, No. 8, pp. 52-57 (6 pages)

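Funding: National Natural Science Foundation of China (61772136); Fujian Provincial Guiding Project (2020H0008); Fujian Provincial Education and Research Project for Young and Middle-aged Teachers (JAT210007).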

Abstract: In recent years, increasingly severe air pollution has become one of the risk factors affecting human health. Air quality index (AQI) data can reveal to governments the patterns of change in the atmospheric environment and can also support air pollution control and management. However, missing values are unavoidable during data collection, which makes the data harder to mine. To make fuller use of the data already collected, the missing values need to be completed, yet existing completion methods often perform poorly at high missing rates. This paper therefore recasts the missing-matrix completion problem as a sparse-matrix reconstruction problem and designs a data completion method based on multi-dimensional sparse representation. The method first uses the training data to simulate various random missing patterns and learns an over-complete dictionary from them. It then uses the upper half of the learned dictionary to obtain the sparse representation of the matrix with missing values. Finally, it combines this sparse representation with the lower half of the dictionary to obtain the reconstructed estimate matrix. Experimental results show that the proposed method outperforms traditional matrix completion methods on multi-dimensional time-series AQI data, with a clear advantage when the data are severely missing.
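The three steps named in the abstract (simulate missing data, code against the upper half of the dictionary, reconstruct with the lower half) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not confirm: samples are plain vectors rather than multi-dimensional blocks, missing entries are zero-filled, the dictionary is learned with the method of optimal directions (MOD), and sparse codes come from orthogonal matching pursuit (OMP). All function names (`omp`, `stack_with_simulated_missing`, `learn_dictionary`, `complete`) are hypothetical.

```python
import numpy as np


def omp(D, y, k):
    """Greedy orthogonal matching pursuit: y ~= D @ x with at most k nonzeros."""
    norms = np.linalg.norm(D, axis=0) + 1e-12
    x = np.zeros(D.shape[1])
    support = []
    residual = np.asarray(y, dtype=float).copy()
    for _ in range(k):
        if np.linalg.norm(residual) < 1e-10:
            break
        # Correlation of each atom with the residual, scaled by atom norm.
        scores = np.abs(D.T @ residual) / norms
        if support:
            scores[support] = -np.inf  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x


def stack_with_simulated_missing(X_full, miss_rate, rng):
    """Stack [zero-filled masked copy; complete copy] for each column (assumed layout)."""
    observed = rng.random(X_full.shape) >= miss_rate  # True = value kept
    return np.vstack([X_full * observed, X_full])


def learn_dictionary(Y, n_atoms, k, n_iter=10, seed=0):
    """MOD dictionary learning (an assumed stand-in for the paper's learner)."""
    rng = np.random.default_rng(seed)
    # Initialise atoms from randomly chosen training columns.
    D = Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0) + 1e-12
    for _ in range(n_iter):
        X = np.column_stack([omp(D, y, k) for y in Y.T])  # sparse coding step
        D = Y @ np.linalg.pinv(X)                         # dictionary update step
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D


def complete(D, x_missing, k):
    """Code the incomplete vector on the upper half, rebuild with the lower half."""
    d = x_missing.shape[0]
    code = omp(D[:d], x_missing, k)
    return D[d:] @ code
```

A call such as `complete(D, x_missing, k=2)` expects `x_missing` to have its missing entries set to zero, matching the zero-filled upper half used during training in this sketch. Stacking the incomplete and complete copies in one training vector is what lets a code computed from incomplete data index into atoms that carry the complete values.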

Keywords: air quality index; missing data; matrix completion; dictionary learning; multi-dimensional sparse representation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
