Affiliations: [1] School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China; [2] Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China; [3] The Key Laboratory of Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
Source: Science China (Information Sciences) [中国科学(信息科学), English edition], 2020, Issue 4, pp. 93-107 (15 pages)
Funding: Supported by the National Natural Science Foundation of Key International Cooperation (Grant No. 61720106002); the Key Research and Development Project of the Ministry of Science and Technology (Grant No. 2017YFC1405100); the National Natural Science Foundation of China (Grant No. 61901141); and the Fundamental Research Funds for the Central Universities (Grant No. HIT.HSRIF.2020010).
Abstract: Satellite video scene classification (SVSC) is an advanced topic in the remote sensing field that refers to determining scene categories from satellite videos. SVSC is an important and fundamental step in satellite video analysis and understanding, providing priors for the presence of objects and dynamic events. In this paper, a two-stage framework is proposed to extract spatial features and motion features for SVSC. The first stage extracts spatial features: representative frames are first selected based on blur detection and the spatial activity of the satellite video, and a fine-tuned visual geometry group network (VGG-Net) is then transferred to extract spatial features from their spatial content. The second stage builds a motion representation. First, the motion representation of moving targets is built from the second temporal principal component of a principal component analysis (PCA). Second, features from the first fully connected layer of VGG-Net are used as a high-level spatial representation of the moving targets. Third, a small long short-term memory (LSTM) network is designed to encode temporal information. The features from the two stages characterize the spatial and temporal patterns of satellite scenes, respectively, and are finally fused for SVSC. A satellite video dataset is built for video scene classification, comprising 7209 video segments and covering 8 scene categories. These satellite videos come from the Jilin-1 satellites and UrtheCast. The experimental results show the efficiency of the proposed framework for SVSC.
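As a rough illustration of the motion-representation step, the sketch below computes temporal principal components of a short grayscale clip with NumPy. It is a minimal sketch under stated assumptions, not the authors' implementation: each pixel is treated as one sample whose features are its intensities over time, and the function name `temporal_principal_component`, the `(T, H, W)` input layout, and the zero-based index `k` are hypothetical choices that the abstract does not specify.

```python
import numpy as np


def temporal_principal_component(frames, k=1):
    """Return the k-th (zero-based) temporal principal component as an image.

    frames: array-like of shape (T, H, W) holding T grayscale frames.
    k=1 selects the second component. This layout and indexing are
    illustrative assumptions, not details taken from the paper.
    """
    frames = np.asarray(frames, dtype=np.float64)
    t, h, w = frames.shape

    # Each pixel is one sample; its T intensities over time are the features.
    x = frames.reshape(t, h * w).T            # shape (H*W, T)
    x = x - x.mean(axis=0, keepdims=True)     # centre each frame (column)

    # T x T covariance between frames, then its eigen-decomposition.
    cov = (x.T @ x) / (x.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order

    # Sort components by decreasing explained variance and pick the k-th.
    order = np.argsort(eigvals)[::-1]
    component = eigvecs[:, order[k]]          # shape (T,)

    # Project every pixel's temporal profile onto the chosen component.
    return (x @ component).reshape(h, w)
```

With `k=1` the function returns the second component. Under these assumptions the first component mostly reflects the static background, so the second tends to respond to pixels whose intensity changes over time. Feeding such a map to VGG-Net and an LSTM, as the abstract describes, is outside the scope of this sketch.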
Keywords: satellite videos; classification; convolutional neural network (CNN); long short-term memory (LSTM); motion representation
Classification code: TP751 [Automation and Computer Technology: Detection Technology and Automation Devices]