Collaborative Learning Method for Natural Image Captioning  


Authors: Rongzhao Wang, Libo Liu

Affiliation: [1] School of Information Engineering, Ningxia University, Yinchuan, China

Source: Proceedings of the International Conference of Pioneering Computer Scientists, Engineers and Educators (ICPCSEE), 2022, Issue 1, pp. 249-261 (13 pages)

Funding: Supported by Grant No. 61862050 from the National Natural Science Foundation of China and Grant No. 2020AAC03031 from the Natural Science Foundation of Ningxia, China.

Abstract: We propose a collaborative learning method for the natural image captioning problem. Numerous existing methods use pretrained image classification CNNs to obtain feature representations for caption generation, which ignores the gap in image feature representations between different computer vision tasks. To address this problem, our method exploits the similarity between the image captioning and pix-to-pix inverting tasks to narrow this feature representation gap. Specifically, our framework consists of two modules: 1) the pix2pix module (P2PM), which uses a shared feature extractor to obtain feature representations and a U-Net architecture that encodes the image into a latent code and then decodes it back to the original image; and 2) the natural language generation module (NLGM), which generates descriptions from the feature representations extracted by the P2PM. Consequently, both the feature representations and the generated image captions improve during the collaborative learning process. Experimental results on the MSCOCO 2017 dataset demonstrate the effectiveness of our approach compared with other methods.
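The abstract outlines a two-branch setup: a shared feature extractor feeds both a pix2pix reconstruction head (P2PM) and a caption decoder (NLGM), and the two losses are optimized jointly so that the reconstruction task shapes the caption features. The sketch below is one minimal PyTorch reading of that setup, not the authors' implementation: the layer sizes, the LSTM decoder, the L1 reconstruction loss, and the weighting factor `alpha` are illustrative assumptions, and the skip connections of a full U-Net are omitted for brevity.

```python
# Hedged sketch of the two-module collaborative setup described in the abstract.
# All module names, layer sizes, and the loss weighting are assumptions for
# illustration; the paper does not give the exact architecture here.
import torch
import torch.nn as nn


class SharedFeatureExtractor(nn.Module):
    """Shared CNN trunk whose features feed both the pix2pix and captioning heads."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, img):                # img: (B, 3, H, W), H and W divisible by 8
        return self.trunk(img)             # (B, feat_dim, H/8, W/8)


class P2PM(nn.Module):
    """Pix2pix module: decoder that reconstructs the input image from shared features
    (skip connections of a full U-Net omitted for brevity)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, feats):
        return self.decoder(feats)         # reconstructed image, (B, 3, H, W)


class NLGM(nn.Module):
    """Natural language generation module: LSTM decoder over pooled shared features."""
    def __init__(self, feat_dim=256, vocab_size=10000, hidden=512):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.init_h = nn.Linear(feat_dim, hidden)
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, feats, captions):    # captions: (B, T) token ids
        g = self.pool(feats).flatten(1)    # (B, feat_dim) global image feature
        h0 = torch.tanh(self.init_h(g)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        emb = self.embed(captions)
        hs, _ = self.lstm(emb, (h0, c0))
        return self.out(hs)                # (B, T, vocab_size) logits


def collaborative_loss(img, captions, extractor, p2pm, nlgm, alpha=1.0):
    """Joint objective: caption cross-entropy plus image reconstruction (L1) loss,
    so gradients from both tasks update the shared feature representation."""
    feats = extractor(img)
    recon = p2pm(feats)
    logits = nlgm(feats, captions[:, :-1])              # teacher forcing
    cap_loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), captions[:, 1:].reshape(-1))
    rec_loss = nn.functional.l1_loss(recon, img)        # assumes img scaled to [-1, 1]
    return cap_loss + alpha * rec_loss
```

A single optimizer over the parameters of all three modules then minimizes this joint loss, which is one plausible way to realize the "collaborative learning" the abstract describes.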

Keywords: Image captioning; Pix2pix inverting; Collaborative learning

Classification: G63 [Culture and Science / Education]

 
