Image Caption Generation Method Based on Local Attention and Mogrifier-LSTM

Authors: DING Yunxia, SHI Yishu, HU Peng, HU Rui, LI Dequan [1]

Affiliation: [1] School of Artificial Intelligence, Anhui University of Science and Technology, Huainan 232001, Anhui, China

Source: Journal of Harbin University of Commerce: Natural Sciences Edition, 2025, No. 1, pp. 3-9 (7 pages)

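Funding: University-Level Key Project of Anhui University of Science and Technology (QNZD2021-02); Huainan Science and Technology Plan Project (2020165, 2021005); Natural Science Research Project of Anhui Universities (2022AH050801); Talent Introduction Fund of Anhui University of Science and Technology (13210679).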

Abstract: Public scenes are complex, and the encoder struggles to capture the intricate person-object relationships in scene images, which leaves the decoder unable to understand image semantics accurately. To address this problem, a public-scene image captioning method, LAM-LSTM, is proposed, based on a local attention mechanism and an improved long short-term memory network. Local attention is introduced to focus on the key regions of the whole scene; the captured key information is fused with the text feature vectors and then fed into the improved long short-term memory network, Mogrifier-LSTM, to generate a natural-language description of the image. LAM-LSTM is validated experimentally on two public datasets, MSCOCO and Flickr30K, using evaluation metrics such as BLEU, METEOR, and CIDEr. The results show that the method improves on the baseline model to varying degrees, demonstrating its effectiveness.
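The abstract does not give the equations of the Mogrifier-LSTM stage, so the following is only a minimal sketch of the gating step as it appears in the generally published Mogrifier LSTM formulation, not the authors' implementation. Before the usual LSTM update, the input x and the previous hidden state h gate each other for a few alternating rounds: on odd rounds x is scaled by 2*sigmoid(Q h), on even rounds h is scaled by 2*sigmoid(R x). The names `mogrify`, `matvec`, `Q`, and `R` are assumptions chosen for illustration, and in this setting x would stand for the fused attention and text feature vector.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    # Plain matrix-vector product on nested lists.
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def mogrify(x, h, Q, R):
    """Alternately gate x and h before they enter an LSTM cell.

    Q: list of (len(x) x len(h)) matrices, used on odd rounds to gate x.
    R: list of (len(h) x len(x)) matrices, used on even rounds to gate h.
    The number of rounds is len(Q) + len(R) (hypothetical convention).
    """
    rounds = len(Q) + len(R)
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            gate = matvec(Q[i // 2], h)
            x = [2.0 * sigmoid(g) * xi for g, xi in zip(gate, x)]
        else:
            gate = matvec(R[i // 2 - 1], x)
            h = [2.0 * sigmoid(g) * hi for g, hi in zip(gate, h)]
    return x, h

# Usage with illustrative sizes: 2-dim input, 3-dim hidden state, 5 rounds.
x = [1.0, -2.0]
h = [0.5, 0.5, 0.5]
Q = [[[0.0] * 3 for _ in range(2)] for _ in range(3)]  # rounds 1, 3, 5
R = [[[0.0] * 2 for _ in range(3)] for _ in range(2)]  # rounds 2, 4
x_mod, h_mod = mogrify(x, h, Q, R)
```

With all-zero weights every gate equals 2*sigmoid(0) = 1, so x and h pass through unchanged; learned weights let each vector rescale the other before the standard LSTM gates are computed.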

Keywords: public scenes; image understanding; attention mechanism; text features; natural language description; image semantics

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
