Authors: Shiming Zhang (张施明); Zhiqian Chen (陈智谦); Jinpeng Mi (米金鹏)
Affiliation: [1] Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai
Source: Modeling and Simulation (《建模与仿真》), 2025, No. 2, pp. 236-244 (9 pages)
Funding: National Natural Science Foundation of China (62106026, 62272170, 42130112); Natural Science Foundation of Shanghai, General Program (23ZR1419300).
Abstract: Referring video object segmentation (RVOS) is an emerging multimodal task that aims to segment target regions in video clips by understanding the semantics of a given referring expression. However, the annotations of the benchmark datasets are collected in a semi-supervised manner, which provides ground-truth object masks only for the first frame of each video. To explore the hidden knowledge in the unlabeled data within a more integrated framework, we introduce online pseudo-labeling for RVOS. Specifically, checkpoints learned on the fly in earlier training epochs serve as the teacher model, which generates pseudo-labels for the unlabeled video frames; these pseudo-labels then augment the training data and supervise the subsequent training stage. To avoid the confusion introduced by pseudo-labels, we propose an uncertainty-aware refinement strategy that adaptively corrects the generated pseudo-labels according to the model's prediction confidence. We conduct extensive experiments on the benchmark datasets Refer-YouTube-VOS and Refer-DAVIS17 to validate the proposed approach. The experimental results show that our model is competitive with state-of-the-art models.
Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]
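
The abstract describes refining teacher-generated pseudo-labels according to prediction confidence but gives no details of the strategy. As a rough illustration only, the sketch below uses plain confidence thresholding, a common stand-in and not necessarily the authors' method. The function name `refine_pseudo_labels`, the `IGNORE` marker, and the 0.8 threshold are all assumptions made for this example.

```python
from typing import List

IGNORE = -1  # hypothetical marker: pixel is excluded from the training loss


def refine_pseudo_labels(
    probs: List[List[float]], threshold: float = 0.8
) -> List[List[int]]:
    """Turn teacher foreground probabilities into a pseudo-label mask.

    Pixels whose confidence max(p, 1 - p) falls below `threshold` are
    marked IGNORE instead of being forced to background (0) or target (1).
    """
    refined = []
    for row in probs:
        out_row = []
        for p in row:
            confidence = max(p, 1.0 - p)
            if confidence < threshold:
                out_row.append(IGNORE)  # too uncertain to trust
            else:
                out_row.append(1 if p >= 0.5 else 0)
        refined.append(out_row)
    return refined


# Toy 2x2 "frame" of teacher foreground probabilities (made-up values).
teacher_probs = [[0.95, 0.55], [0.10, 0.30]]
pseudo_mask = refine_pseudo_labels(teacher_probs, threshold=0.8)
print(pseudo_mask)
```

In a training loop, a mask like this would typically be combined with a loss that skips the ignored pixels, so that only confident teacher predictions supervise the next stage.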