Image caption of space science experiment based on multi-modal learning
(基于多模态学习的空间科学实验图像描述)

Cited by: 2

Authors: LI Pei-zhuo (李沛卓); WAN Xue (万雪); LI Sheng-yang (李盛阳) (Key Laboratory of Space Utilization, Chinese Academy of Sciences; Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing 100094, China)

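Affiliation: [1] University of Chinese Academy of Sciences; Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences; Key Laboratory of Space Utilization, Chinese Academy of Sciences, Beijing 100094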

Source: Optics and Precision Engineering (《光学精密工程》), 2021, No. 12, pp. 2944-2955 (12 pages)

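Funding: Key Project of the Forward-Looking Research Program, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences (No. Y8031831WY).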

Abstract: To help scientists quickly locate the key stages of an experiment and obtain more detailed information about the experimental process, descriptive text needs to be added automatically to space science experiment imagery. To address the small targets and limited data samples typical of space science experiments, this paper proposes an image captioning model for space science experiments based on multi-modal learning. The model has four parts: a semantic segmentation model based on an improved U-Net; generation of space science experiment vocabulary candidates from the semantic segmentation; bottom-up extraction of general scene image feature vectors; and caption generation based on multi-modal learning. In addition, a space science experiment target dataset is constructed, including semantic mask annotations and image caption annotations. Experimental results show that, compared with the classic image captioning model Neuraltalk2, the proposed algorithm improves METEOR by 0.089 and SPICE by 0.174 on average. The model addresses the difficulties of small targets and few samples, meets the requirement for professional and accurate descriptions of space science experiment scenes, and moves from low-level perception toward deep scene understanding.

Keywords: space science experiment; image captioning; semantic segmentation; multi-modal learning

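Classification: TP394.1 [Automation and Computer Technology: Computer Application Technology]; TH691.9 [Automation and Computer Technology: Computer Science and Technology]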
