Authors: Yimin FU, Zhunga LIU, Zicheng WANG
Affiliation: [1] School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
Source: Science China (Information Sciences), 2024, Issue 6, pp. 293-308 (16 pages). Chinese title: 中国科学(信息科学)(英文版)
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. U20B2067); the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University (Grant No. CX2023015); and the Cultivation Foundation for Excellent Doctoral Dissertation of the School of Automation of Northwestern Polytechnical University.
Abstract: Robust open-set recognition (OSR) performance has become a prerequisite for pattern recognition systems in real-world applications. However, existing OSR methods are primarily built on single-modal perception, and their performance is limited when single-modal data fail to describe the objects sufficiently. Although multimodal data can provide more comprehensive information than single-modal data, the learning of decision boundaries can be affected by the feature representation gap between modalities. To effectively integrate multimodal data for robust OSR performance, we propose logit prototype learning (LPL) with active multimodal representation. In LPL, the input multimodal data are transformed into the logit space, enabling direct exploration of intermodal correlations without the impact of scale inconsistency. The fusion weight of each modality is then determined using an entropy-based uncertainty estimation method. This approach adaptively adjusts the fusion strategy so that comprehensive descriptions remain available in the presence of external disturbances. Moreover, the single-modal and multimodal representations are jointly and interactively optimized to learn discriminative decision boundaries. Finally, a stepwise recognition rule is employed to reduce the misclassification risk and facilitate the distinction between known and unknown classes. Extensive experiments on three multimodal datasets demonstrate the effectiveness of the proposed method.
Keywords: logit prototype learning; multimodal perception; open-set recognition; uncertainty estimation
Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
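
Illustrative sketch: The entropy-based fusion step described in the abstract can be pictured roughly as follows. This is a minimal sketch and not the authors' implementation: the record gives no formulas, so the weighting rule (weight proportional to exp(-entropy)), the function names, and the fixed probability threshold used to reject unknowns are all assumptions made for illustration. In particular, the thresholding here is a generic open-set baseline standing in for the paper's stepwise recognition rule, which the abstract does not specify.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fusion_weights(modal_logits):
    """One weight per modality, summing to 1.

    Assumption: a modality with lower predictive entropy (less
    uncertainty) gets a larger weight, proportional to exp(-entropy).
    """
    scores = [math.exp(-entropy(softmax(l))) for l in modal_logits]
    total = sum(scores)
    return [s / total for s in scores]


def fuse(modal_logits):
    """Weighted sum of per-modality logits, class by class."""
    weights = fusion_weights(modal_logits)
    num_classes = len(modal_logits[0])
    return [
        sum(w * logits[k] for w, logits in zip(weights, modal_logits))
        for k in range(num_classes)
    ]


def recognize(modal_logits, threshold=0.5):
    """Return the predicted known-class index, or -1 for 'unknown'.

    Assumption: a plain max-probability threshold, used here only as
    a stand-in for the unspecified stepwise recognition rule.
    """
    probs = softmax(fuse(modal_logits))
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1
```

For example, `recognize([[8.0, 0.0, 0.0], [0.1, 0.0, 0.1]])` lets the confident first modality dominate the fusion, whereas two near-uniform modalities yield a low maximum probability and the sample is rejected as unknown.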