Region-sensitive Scene Graph Generation Method


Authors: WANG Lichun, FU Fangyu [1,2], XU Kai, XU Hongbo, YIN Baocai

Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [2] Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing 100124, China

Source: Journal of Beijing University of Technology, 2025, No. 1, pp. 51-58 (8 pages)

Funding: National Natural Science Foundation of China (62376014); China University Industry-University-Research Innovation Fund (2021JQR023).

Abstract: To address the relatively coarse granularity of predicate features extracted from the relationship bounding box, a region-sensitive scene graph generation (RS-SGG) method is proposed. The predicate feature extraction module divides the relationship bounding box into four regions and uses a self-attention mechanism to suppress background regions that are irrelevant to relationship classification. When predicting relationships, the relationship feature decoder considers not only the visual and semantic features of each object pair but also its position features. On the publicly available Visual Genome (VG) dataset, graph-constraint recall and no-graph-constraint recall were computed for three subtasks (scene graph detection, scene graph classification, and predicate classification), and RS-SGG was compared with mainstream scene graph generation methods. The results show that RS-SGG outperforms the mainstream methods on both graph-constraint recall and no-graph-constraint recall. Visualization experiments further demonstrate the effectiveness of the proposed method.
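The two recall variants named in the abstract can be illustrated with a minimal sketch. It follows the usual convention in the scene graph literature rather than the authors' own evaluation code. With the graph constraint, each ordered subject-object pair contributes only its highest-scoring predicate. Without it, every predicate of every pair competes for the top K slots. The function name `recall_at_k`, the tuple layout, and the toy data are assumptions made for illustration. Objects are matched by integer id, which sidesteps the box-overlap matching a full scene graph detection evaluation would require.

```python
def recall_at_k(predictions, ground_truth, k, graph_constraint=True):
    """Recall@K over relationship triples (illustrative, hypothetical helper).

    predictions: list of (subject_id, predicate, object_id, score)
    ground_truth: set of (subject_id, predicate, object_id)
    """
    if graph_constraint:
        # Keep only the best-scoring predicate for each ordered object pair.
        best = {}
        for s, p, o, score in predictions:
            if (s, o) not in best or score > best[(s, o)][3]:
                best[(s, o)] = (s, p, o, score)
        candidates = list(best.values())
    else:
        # Every predicate of every pair is a candidate.
        candidates = list(predictions)

    # Rank candidates by score and keep the top K triples.
    top_k = sorted(candidates, key=lambda t: t[3], reverse=True)[:k]
    hits = {(s, p, o) for s, p, o, _ in top_k} & set(ground_truth)
    return len(hits) / len(ground_truth) if ground_truth else 0.0


# Toy data (made up): object ids 0, 1, 2 with scored predicate guesses.
preds = [
    (0, "on", 1, 0.9),
    (0, "sitting on", 1, 0.8),
    (2, "near", 1, 0.6),
    (2, "behind", 1, 0.5),
]
gt = {(0, "sitting on", 1), (2, "near", 1)}

print(recall_at_k(preds, gt, k=2, graph_constraint=True))   # 0.5
print(recall_at_k(preds, gt, k=3, graph_constraint=False))  # 1.0
```

In the toy data the constrained metric misses "sitting on" because "on" scores higher for the same pair, while the unconstrained metric recovers it once K is large enough. This is why the two recalls are reported separately.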

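Keywords: image understanding; scene graph generation; object classification; relationship classification; region awareness; self-attention mechanism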

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
