Panoptic Scene Graph Generation Method Based on Relation Feature Enhancement


Authors: LI Linhao; WANG Yize; LI Yingshuang; DONG Yongfeng [1,2,3]; WANG Zhen (School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; Hebei Province Key Laboratory of Big Data Computing (Hebei University of Technology), Tianjin 300401, China; Hebei Data Driven Industrial Intelligent Engineering Research Center (Hebei University of Technology), Tianjin 300401, China)

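Affiliations: [1] School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401; [2] Hebei Province Key Laboratory of Big Data Computing (Hebei University of Technology), Tianjin 300401; [3] Hebei Data Driven Industrial Intelligent Engineering Research Center (Hebei University of Technology), Tianjin 300401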

Source: Journal of Computer Applications (《计算机应用》), 2025, No. 2, pp. 584-593 (10 pages)

Funding: Natural Science Research Project of Higher Education Institutions of Hebei Province (QN2023262).

Abstract: Panoptic Scene Graph Generation (PSGG) aims to identify all objects in an image and automatically capture the semantic associations among them. Modeling these associations depends on feature descriptions of the target objects and of subject-object pairs. However, current methods have several limitations: object features extracted from bounding boxes are ambiguous; the methods focus only on the semantic and spatial position features of objects, ignoring the semantic joint features and relative position features of subject-object pairs, which are equally important for relation prediction; and the methods do not extract features differently for different types of subject-object pairs (e.g., foreground-foreground, foreground-background, background-background), ignoring the differences among them. To address these problems, a PSGG method based on Relation Feature Enhancement (RFE) is proposed. First, pixel-level mask region features are introduced to enrich the detail of object features, and the joint visual features, semantic joint features, and relative position features of subject-object pairs are fused effectively. Second, the feature extraction method best suited to each type of subject-object pair is selected adaptively. Finally, the enhanced, more accurate relation features are used for relation prediction. Experimental results on the PSG dataset show that, with VCTree (Visual Contexts Tree), Motifs, IMP (Iterative Message Passing), and GPSNet as baseline methods and ResNet-101 as the backbone network, RFE improves recall (R@20) on the challenging SGGen task by 4.37, 3.68, 2.08, and 1.80 percentage points, respectively, validating the effectiveness of the proposed method for PSGG.
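The abstract contrasts object features taken from bounding boxes with features taken from pixel-level masks. The toy sketch below illustrates why the difference matters; it is not the authors' implementation. The helper names (`box_from_mask`, `box_pool`, `mask_pool`), the single-channel feature map, and the use of plain average pooling are all assumptions made for illustration.

```python
from typing import List, Tuple

Feature = List[List[float]]  # H x W single-channel feature map (toy stand-in)
Mask = List[List[int]]       # H x W binary mask, 1 = pixel belongs to the object


def box_from_mask(mask: Mask) -> Tuple[int, int, int, int]:
    """Tightest inclusive box (row0, col0, row1, col1) around the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask is empty")
    return min(rows), min(cols), max(rows), max(cols)


def box_pool(feat: Feature, box: Tuple[int, int, int, int]) -> float:
    """Average every value inside the box, object and background alike."""
    r0, c0, r1, c1 = box
    vals = [feat[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
    return sum(vals) / len(vals)


def mask_pool(feat: Feature, mask: Mask) -> float:
    """Average only the values at pixels the mask marks as object."""
    vals = [feat[r][c] for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not vals:
        raise ValueError("mask is empty")
    return sum(vals) / len(vals)


# Hypothetical 3x3 map: object pixels respond with 1.0, background with 0.0.
feat = [[1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0],
        [1.0, 1.0, 1.0]]
mask = [[1, 0, 0],
        [1, 1, 0],
        [1, 1, 1]]  # a triangular object

box = box_from_mask(mask)
box_feature = box_pool(feat, box)     # averaged over all 9 pixels in the box
mask_feature = mask_pool(feat, mask)  # averaged over the 6 object pixels only
```

For a non-rectangular object, the box average is diluted by the background pixels that fall inside the box, while the mask average is not. This is one plausible reading of the "ambiguous" bounding-box features the abstract describes.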

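Keywords: panoptic scene graph generation; subject-object pair joint features; relation feature enhancement; semantic association; adaptive selection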

Classification No.: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
