Explainable Recommendation Algorithm Fusing Multimodal Features


Authors: WANG Zi-Xuan; ZHANG Kai-Han; CAI Jiang-Hui; GUO Qing-Song; XU Xin-Fang (School of Computer Science and Technology, North University of China, Taiyuan 030051, China)

Affiliation: [1] School of Computer Science and Technology, North University of China, Taiyuan 030051

Source: Computer Systems & Applications, 2025, No. 3, pp. 62-71 (10 pages)

Funding: National Natural Science Foundation of China (72171137, 62401525); Fundamental Research Program of Shanxi Province (202203021222075, 202203021211331).

Abstract: Explainable recommendation algorithms use behavioral and other relevant information not only to generate recommendations but also to provide reasons for them, thereby increasing the transparency and credibility of recommendations. Traditional explainable recommendation algorithms are often limited to analyzing rating data and text data, make insufficient use of data such as images, and do not adequately consider how to fuse modalities effectively, which makes it difficult to fully uncover the intrinsic relationships between modalities. To address these problems, an explainable recommendation model that fuses multimodal features is proposed; it uses feature fusion to improve the quality and personalization of recommendation explanations from a multimodal perspective. First, a multimodal feature extraction method is designed that uses the CLIP image encoder and text encoder to extract the image features and text features of users and items, respectively. Second, cross-attention is used to fuse text and images across modalities, strengthening the semantic correlation between them. Finally, the multimodal information is combined with interaction information to jointly optimize the modal alignment, rating prediction, and explanation generation tasks. Experimental results show that the proposed method has clear advantages on three multimodal recommendation datasets, especially in improving explanation quality.
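The cross-modal fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the architecture, so everything here is an assumption made for illustration, including the function names (`cross_attention`, `fuse`, `init_params`), the feature dimension, single-head attention, the residual connection, and mean pooling. Random arrays stand in for the CLIP text and image features.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head scaled dot-product attention in which one modality
    supplies the queries and the other supplies the keys and values."""
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights


def init_params(dim, rng):
    """Hypothetical helper: random projection matrices for both directions."""
    def triple():
        return tuple(
            rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3)
        )
    return {"text_to_image": triple(), "image_to_text": triple()}


def fuse(text_feats, image_feats, params):
    """Attend in both directions, add a residual connection, mean-pool each
    modality, and concatenate. The pooling and concatenation are assumptions."""
    t2i, _ = cross_attention(text_feats, image_feats, *params["text_to_image"])
    i2t, _ = cross_attention(image_feats, text_feats, *params["image_to_text"])
    text_out = (text_feats + t2i).mean(axis=0)
    image_out = (image_feats + i2t).mean(axis=0)
    return np.concatenate([text_out, image_out])


rng = np.random.default_rng(0)
dim = 8                                    # assumed feature size
text_feats = rng.normal(size=(5, dim))     # stand-in for 5 text embeddings
image_feats = rng.normal(size=(3, dim))    # stand-in for 3 image embeddings
fused = fuse(text_feats, image_feats, init_params(dim, rng))
print(fused.shape)
```

In this sketch, text features query image features and vice versa, so each modality's representation is re-weighted by its relevance to the other before the two are combined into one vector for downstream prediction.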

Keywords: explainable recommendation; multimodal; feature fusion; cross-attention; modal alignment

Classification: TP391.3 [Automation and Computer Technology: Computer Application Technology]

 
