Authors: LI Pei-zhuo; WAN Xue; LI Sheng-yang
Affiliation: [1] Key Laboratory of Space Utilization, Chinese Academy of Sciences; Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing 100094, China
Source: Optics and Precision Engineering (《光学精密工程》), 2021, No. 12, pp. 2944-2955 (12 pages)
Funding: Key Project of the Forward-Looking Research Program, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences (No. Y8031831WY).
Abstract: To help scientists quickly locate the key stages of an experiment and obtain more detailed information about the experimental process, descriptive text needs to be added automatically to space science experiment imagery. To address the small targets and limited data samples typical of space science experiments, this paper proposes an image captioning model for space science experiments based on multimodal learning. The model has four parts: a semantic segmentation model based on an improved U-Net; generation of candidate space science experiment vocabulary from the semantic segmentation; bottom-up extraction of general scene image feature vectors; and caption generation based on multimodal learning. In addition, a space science experiment target dataset is constructed, with both semantic mask annotations and image caption annotations. Experimental results show that, compared with the classic image captioning model NeuralTalk2, the proposed algorithm improves METEOR by 0.089 and SPICE by 0.174 on average. The approach addresses the difficulties of small targets and scarce samples, meets the requirement for professional and accurate descriptions of space science experiment scenes, and moves from low-level perception toward deep scene understanding.
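Classification: TP394.1 [Automation and Computer Technology: Computer Application Technology]; TH691.9 [Automation and Computer Technology: Computer Science and Technology]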