Affiliations: [1] Tsinghua University, Beijing 100084 [2] University of Science and Technology Beijing, Beijing 100083 [3] Xinjiang University, Urumqi 830046
Source: Journal of Image and Graphics (《中国图象图形学报》), 2021, No. 10, pp. 2503-2513 (11 pages)
Funding: National Natural Science Foundation of China (U20B2062; 61773231); National Key Research and Development Program of China (2016YFB0100901); Beijing Municipal Science and Technology Project (Z191100007419001).
Abstract: Objective: Foreground segmentation is an important task in image understanding. Under unsupervised conditions, different images and instances often exhibit highly variable appearances, so methods based on fixed rules or a single type of feature struggle to deliver stable segmentation performance. To address this problem, this paper proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF). Method: Semantic features respond precisely to the key regions of foreground objects, but the resulting segmentations tend to cover only those key regions and lack a complete representation of the object. Apparent features, typified by saliency and edges, provide richer detail, but appearance-based rules cannot cope with varying instances and imaging conditions. To combine the strengths of both, we build an encoding of unary region features and binary context features that fuses semantic and apparent information, yielding a comprehensive description of the two kinds of expression. We then design an intra-image adaptive parameter learning scheme that computes the most suitable feature weights and generates a foreground confidence score map. Finally, a segmentation network is used to learn the foreground characteristics shared across instances. Results: By fusing semantic and apparent features and learning shared semantics across images, the proposed method achieves foreground segmentation performance on the PASCAL VOC (pattern analysis, statistical modelling and computational learning visual object classes) 2012 training and validation sets that clearly surpasses class activation mapping (CAM) and discriminative regional feature integration (DRFI), improving the F-measure by 3.5% and 3.4%, respectively. Conclusion: The method can take any semantic-feature and any apparent-feature foreground module as its basic units, fuse and optimize the two strategies, and achieve better foreground segmentation performance.
Objective: Foreground segmentation is an essential problem in the field of image understanding and a preprocessing step for salient object detection, semantic segmentation, and various pixel-level learning tasks. Given an image, the task is to assign each pixel a foreground or background label. Fully supervised methods can achieve satisfactory results via multi-instance learning. Under unsupervised conditions, however, achieving stable segmentation performance with fixed rules or a single type of feature is difficult, because different images and instances have highly variable expressions. Moreover, different types of methods have different strengths and weaknesses. On the one hand, methods based on semantic features can extract the key regions of foregrounds accurately but cannot generate complete object regions and detailed edges. On the other hand, richer detailed expression can be obtained with an apparent-feature-based framework, but such a framework does not generalize across variable cases. Method: Based on these observations, we propose an unsupervised foreground segmentation method based on semantic-apparent feature fusion. First, given a sample, we encode it as semantic and apparent feature maps. We use a class activation mapping model pretrained on ImageNet to generate the semantic heat map, and select saliency and edge maps to express the apparent features. Any kind of semantic or apparent feature can be used, and the framework adapts readily to each case. Second, to combine the advantages of the two types of features, we split the image into superpixels and express four elements as unary and binary semantic and apparent features, which realizes a comprehensive description of the two types of expression. Specifically, we build two binary relation matrices, based on the apparent and the semantic features respectively, to measure the similarity of each pair of superpixels. For generating the binary semantic feat
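The abstract describes fusing a semantic heat map with apparent (saliency/edge) cues through per-image adaptive weights to produce a foreground confidence map. The sketch below illustrates that idea with a simplified, self-consistency-based reweighting loop; the function name, the agreement objective, and the fixed 0.5 threshold are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def fuse_confidence(semantic_map, apparent_map, iters=10):
    """Illustrative per-image adaptive weighting of a semantic cue and an
    apparent cue into a single foreground confidence map (a sketch, not
    the paper's actual optimization)."""
    def norm(x):
        # Normalize each cue to [0, 1] so their weights are comparable.
        x = x.astype(np.float64)
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    s, a = norm(semantic_map), norm(apparent_map)
    w_s = w_a = 0.5  # start from equal trust in both cues
    for _ in range(iters):
        fused = w_s * s + w_a * a
        # Provisional foreground mask from the current fused confidence.
        pseudo = fused > fused.mean()
        # Re-weight each cue by how well it agrees with the mask.
        agree_s = np.mean((s > 0.5) == pseudo)
        agree_a = np.mean((a > 0.5) == pseudo)
        total = agree_s + agree_a + 1e-8
        w_s, w_a = agree_s / total, agree_a / total
    return w_s * s + w_a * a
```

In this toy scheme a cue that disagrees with the emerging consensus is gradually down-weighted, which mirrors (in spirit) the intra-image adaptive parameter learning the abstract describes; the paper additionally operates on superpixels with binary similarity matrices rather than on raw pixels.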
Keywords: computer vision; foreground segmentation; unsupervised learning; semantic-apparent feature fusion; natural scene images; PASCAL VOC dataset; adaptive weighting
Classification: TP391 [Automation and Computer Technology — Computer Application Technology]