Authors: LI Zhixin (李志欣) [1]; SU Qiang (苏强) [1]
Affiliation: [1] Guangxi Key Lab of Multi-source Information Mining and Security (Guangxi Normal University), Guilin, Guangxi 541004, China
Source: Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报(自然科学版)》), 2022, No. 5, pp. 418-432 (15 pages)
Funding: National Natural Science Foundation of China (61966004, 61866004); Guangxi Natural Science Foundation (2019GXNSFDA245018); Guangxi "Bagui Scholar" Program special fund.
Abstract: Automatically generating a human-like description for a given image is one of the most important tasks in artificial intelligence. Most existing attention-based methods explore the mapping between words in the sentence and regions in the image. However, this hard-to-predict matching sometimes produces inharmonious alignments between the two modalities, which reduces the quality of the generated captions. To address this problem, this paper proposes a text-related word attention that improves the correctness of visual attention. As the model generates the description word by word, this word attention emphasizes the importance of different words and makes full use of the internal annotation knowledge in the training data to assist the calculation of visual attention. Furthermore, to reveal implied information in the image that machines cannot express directly, knowledge extracted from external knowledge graphs is injected into the encoder-decoder framework to generate more novel and natural captions. Experiments on the MSCOCO and Flickr30k image captioning benchmark datasets show that the method achieves good performance and outperforms many existing state-of-the-art approaches.
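Keywords: image captioning; internal knowledge; external knowledge; word attention; knowledge graph; reinforcement learning
Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]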