Authors: 庄志刚 (ZHUANG Zhi-gang); 许青林 (XU Qing-lin) [1] (School of Computer, Guangdong University of Technology, Guangzhou 510000, China)
Source: 《计算机科学》 (Computer Science), 2020, No. 4, pp. 136-141 (6 pages)
Funding: Science and Technology Planning Project of Guangdong Province (2016B030306003).
Abstract: A scene graph is a graph structure that describes the content of an image. Two problems arise when generating it: 1) two-step scene graph generation methods lose useful information, which makes the task harder; 2) the long-tail distribution of visual relationships causes the model to overfit and raises the error rate of relationship reasoning. To address these two problems, this paper proposes SGiF (Scene Graph in Features), a scene graph generation model that combines multi-scale feature maps with ring-type relationship reasoning. First, the likelihood that a visual relationship exists is computed for every feature point on the multi-scale feature maps, and the features of the points with high likelihood are extracted. Then, subject-object combinations are decoded from the extracted features, and the decoded results are deduplicated according to their category differences to obtain the scene graph structure. Finally, rings (cycles) that contain the target relationship edge are detected in the scene graph structure; the other edges on each ring serve as input for computing an adjustment factor, which is used to adjust the original relationship reasoning result and complete the scene graph. SGGen and PredCls are used as evaluation settings. Experiments on a subset of VG (Visual Genome), a large scene graph generation dataset, show that with multi-scale feature maps SGiF improves the hit rate of visual relationship detection by 7.1% over the two-step baseline, and that with ring-type relationship reasoning SGiF improves the hit rate of relationship reasoning by 2.18% over the baseline without ring-type reasoning, which demonstrates the effectiveness of SGiF.
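The ring-detection step in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `rings_through_edge`, the `max_len` cap, and the choice to ignore edge direction are assumptions made for illustration only. The abstract does not give the formula for the adjustment factor, so the sketch stops at returning the other edges on each ring, which are the inputs the abstract says that factor is computed from.

```python
from collections import defaultdict


def rings_through_edge(edges, target, max_len=3):
    """Find simple rings (cycles) that contain the `target` edge.

    edges:   iterable of (subject, object) node pairs of the scene graph
    target:  the (subject, object) pair whose relationship is being refined
    max_len: hypothetical cap on ring length, counted in edges

    Edge direction is ignored (an assumption). Each ring is returned as the
    list of its edges other than `target`, in traversal order.
    """
    adj = defaultdict(set)
    for s, o in edges:
        adj[s].add(o)
        adj[o].add(s)

    src, dst = target
    rings = []

    def walk(node, path):
        # `path` holds the nodes visited so far, starting at `dst`.
        for nxt in sorted(adj[node]):
            if nxt == src:
                # Closing the ring; skip the trivial return over `target`.
                if node != dst:
                    rings.append(list(zip(path, path[1:])) + [(node, src)])
                continue
            # Extend only if the finished ring would stay within max_len.
            if nxt not in path and len(path) < max_len - 1:
                walk(nxt, path + [nxt])

    walk(dst, [dst])
    return rings


edges = [("man", "horse"), ("man", "hat"), ("hat", "horse"), ("horse", "grass")]
print(rings_through_edge(edges, ("man", "horse")))
```

In this toy graph the edge man-horse lies on one triangle (via hat), while horse-grass lies on no ring, so only the former would receive any ring-based adjustment.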
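Keywords: scene graph generation; multi-scale feature map; ring-type relationship reasoning; convolutional neural network; image understanding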
Classification number: TP389.1 [Automation and Computer Technology: Computer System Architecture]