Image Caption Generation Method of a Two-Layer LSTM Based on Semantic Weighting


Authors: Shao Jingchen, Chai Yumei[1], Wang Liming[1] (School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China)

Affiliation: [1] School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China

Source: Computer Applications and Software, 2024, No. 10, pp. 155-162 (8 pages)

Funding: NSFC-General Technology Basic Research Joint Fund project (U1636111).

Abstract: To address two shortcomings of some current models, namely insufficient use of image semantic information and the lack of scene-specific partitioning, this paper proposes SW-2LSTM, an image captioning method. A model based on a ResNet-LSTM network is constructed, with a linear layer and a batch normalization (BN) layer added, and the image captions are preprocessed to obtain the corresponding tags. Tag vectors generated from the extracted image tags act directly on the weight matrix, extending the original weight matrix into a set of tag-dependent weight matrices. This set is then factorized following the idea of tensor decomposition, and a beam search algorithm is added. Finally, the MS COCO dataset is divided into scenes according to its basic categories. Experimental results show that the proposed model effectively improves the quality of the generated captions.
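The abstract does not give the exact form of the factorization, so the following is only a minimal sketch under an assumed three-factor form, W(s) = Wa · diag(Wb · s) · Wc, which is one common way to factor a tag-dependent set of weight matrices. The names `Wa`, `Wb`, `Wc`, `tags`, and `tag_weighted_apply` are hypothetical and do not come from the paper. The sketch shows that the tag-dependent matrix never has to be built explicitly: the tag vector only rescales a small set of shared factors.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def tag_weighted_apply(Wa, Wb, Wc, tags, x):
    """Compute W(tags) @ x for the ASSUMED form W(tags) = Wa @ diag(Wb @ tags) @ Wc.

    Shapes: Wa is (n_out, n_factors), Wb is (n_factors, n_tags),
    Wc is (n_factors, n_in). The full matrix W(tags) is never materialized.
    """
    gate = matvec(Wb, tags)   # tag-dependent scale for each factor
    proj = matvec(Wc, x)      # input projected into factor space
    mixed = [g * p for g, p in zip(gate, proj)]
    return matvec(Wa, mixed)  # back to the output space


def explicit_weight(Wa, Wb, Wc, tags):
    """Build W(tags) entry by entry; used only to check the factored path."""
    gate = matvec(Wb, tags)
    return [[sum(Wa[i][f] * gate[f] * Wc[f][j] for f in range(len(gate)))
             for j in range(len(Wc[0]))]
            for i in range(len(Wa))]


def rand_matrix(rng, rows, cols):
    """Random matrix with entries in [-1, 1] (illustrative values only)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen for illustration; they are not the paper's settings.
rng = random.Random(0)
n_out, n_factors, n_in, n_tags = 4, 3, 5, 6
Wa = rand_matrix(rng, n_out, n_factors)
Wb = rand_matrix(rng, n_factors, n_tags)
Wc = rand_matrix(rng, n_factors, n_in)
tags = [rng.random() for _ in range(n_tags)]    # e.g. tag probabilities
x = [rng.uniform(-1, 1) for _ in range(n_in)]   # e.g. an LSTM input vector

fast = tag_weighted_apply(Wa, Wb, Wc, tags, x)
slow = matvec(explicit_weight(Wa, Wb, Wc, tags), x)
```

Here `fast` and `slow` agree up to floating-point error. Under this assumed form, the factored path stores n_factors × (n_out + n_in + n_tags) parameters rather than one n_out × n_in matrix per tag.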

Keywords: image captioning; deep learning; long short-term memory network; image features; tags

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
