Multi-task face age estimation in vision-language multimodality


Authors: HE Jiang; CHI Jing [1]; CHI Jiaji; GAO Song

Affiliations: [1] School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, Hebei, China; [2] School of Electrical Engineering, Lappeenranta University of Technology, Lappeenranta 53850, South Karelia, Finland; [3] Handan No.3 Construction Engineering Co., Ltd., Handan 056001, Hebei, China

Source: Modern Electronics Technique (《现代电子技术》), 2024, No. 14, pp. 171-176 (6 pages)

Funding: Handan Science and Technology Research and Development Program (21422031252).

Abstract: Existing age estimation methods rely on face images alone and cannot fully exploit the linguistic context behind those images. They also tend to optimize a single age estimation task, ignoring information from closely related tasks that could improve model performance. To address these problems, a multi-task face age estimation method based on vision-language multimodality is proposed. The method uses prompt text to provide richer, more accurate image understanding and prior knowledge for age estimation. A multi-task learning scheme is also introduced, combining the age classification task with the ordinal regression task to exploit their complementarity and obtain better performance. Finally, to obtain reliable predictions, two methods for fusing the multi-task results are studied: weighted averaging and task regression. Ablation experiments on the weight coefficients of the weighted averaging method are conducted to find a suitable set of weights. The results show that, compared with other state-of-the-art methods, the proposed method reduces the mean absolute error (MAE) by 7.32% on the UTK-FACE dataset; on the Morph II dataset it reduces the MAE by 1.20% and improves the cumulative score (CS) by 0.11%.
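The weighted-average fusion and the two reported metrics can be illustrated with a minimal sketch. The abstract does not say how each head's output is decoded, which weights were selected, or what tolerance the CS uses, so everything below is an assumption made for illustration: the function names, the expected-value decoding of the classification head, the threshold-counting decoding of the ordinal head, the default weight of 0.5, and the 5-year CS tolerance are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def expected_age(class_probs: Sequence[float], min_age: int = 0) -> float:
    """Classification head (assumed decoding): expected age under a
    probability distribution over consecutive integer ages from min_age."""
    return sum((min_age + k) * p for k, p in enumerate(class_probs))


def ordinal_age(exceed_probs: Sequence[float], min_age: int = 0) -> float:
    """Ordinal regression head (assumed decoding): exceed_probs[k] estimates
    P(age > min_age + k); the prediction counts the thresholds exceeded."""
    return float(min_age + sum(1 for p in exceed_probs if p > 0.5))


def fuse_weighted(age_cls: float, age_ord: float, w_cls: float = 0.5) -> float:
    """Weighted average of the two task outputs; w_cls is a hypothetical
    weight that an ablation study would tune."""
    if not 0.0 <= w_cls <= 1.0:
        raise ValueError("w_cls must lie in [0, 1]")
    return w_cls * age_cls + (1.0 - w_cls) * age_ord


def mae(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean absolute error in years."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)


def cumulative_score(preds: Sequence[float], targets: Sequence[float],
                     tolerance: float = 5.0) -> float:
    """Percentage of samples whose absolute error is within the tolerance
    (the 5-year default is an assumption)."""
    hits = sum(1 for p, t in zip(preds, targets) if abs(p - t) <= tolerance)
    return 100.0 * hits / len(targets)


if __name__ == "__main__":
    # Toy outputs for one face, with candidate ages starting at 20.
    age_cls = expected_age([0.0, 0.5, 0.5], min_age=20)
    age_ord = ordinal_age([0.9, 0.8, 0.3], min_age=20)
    print(fuse_weighted(age_cls, age_ord, w_cls=0.5))
```

Under these assumptions, sweeping `w_cls` over a grid and comparing the resulting MAE on a validation set would mirror the kind of weight ablation the abstract mentions. The task-regression fusion method would instead learn the mapping from the two task outputs to a final age, and is not sketched here.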

Keywords: age estimation; vision-language multimodality; multi-task learning; weighted averaging; prompt text; task regressor

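Classification numbers: TN711-34 [Electronics and Telecommunications: Circuits and Systems]; TP391 [Automation and Computer Technology: Computer Application Technology]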

 
