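Affiliations: [1] School of Computer Science, South China Normal University, Guangzhou 510631 [2] R&D Department, DataGrand Information Technology (Shanghai) Co., Ltd., Shanghai 201203
Source: Chinese Journal of Computers (《计算机学报》), 2023, No. 10, pp. 2161-2177 (17 pages)
Funding: Supported by the National Natural Science Foundation of China (62172166, U1811263, 61772366); the Guangdong Basic and Applied Basic Research Foundation (2022A1515011380); and the Natural Science Foundation of Shanghai (17ZR1445900).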
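Abstract: Personalized news recommendation methods based on deep learning usually train the model with full updates. Full updates, however, require continually merging new data into a new training set; this safeguards model performance but makes training inefficient. Moreover, for data privacy and storage reasons, real-world applications usually do not retain all historical data, which makes full updates unsustainable. Incremental learning is a widely adopted and effective solution. However, news recommendation models based on incremental learning face a new challenge, catastrophic forgetting, for which the common strategies are regularization-based and replay-based methods. Regularization-based methods are limited to aligning, or matching the spatial geometric structure of, the features that individual samples learn in the new task and the response features of the original network, and so lack a global view. Replay-based methods replay data from past tasks, which may leak private data. To address these shortcomings, this paper proposes OT-KR, an incremental learning method for news recommendation models based on Optimal Transport and Knowledge Replay. OT-KR reconstructs a set of joint distribution knowledge features through a joint distribution knowledge extractor and uses optimal transport theory to minimize the distribution discrepancy between new and old tasks during training, ensuring that the domain distribution learned by the new model fits both old and new tasks and thereby achieving knowledge fusion. In particular, to mitigate data privacy leakage, OT-KR stores only model parameters, rather than samples, as the knowledge to replay. It also borrows the idea of multi-teacher knowledge distillation so that the model on the new task can fuse the distribution information from all teacher streams at once, with weights assigned according to the order in which the tasks were learned. Experiments on public news recommendation datasets show that OT-KR outperforms news recommendation methods based on current mainstream incremental learning techniques, improving on the best existing performance by an average of 0.55% in AUC and 0.47% in NDCG@10, while balancing recommendation performance and training efficiency well.
English abstract: Personalized news recommendation methods based on deep learning usually use full updates to train the model. However, full updates require continuous integration of new data to form a new training set. Although this effectively guarantees the performance of the news recommendation model, training efficiency is low. In addition, due to data privacy and storage considerations, news recommendation applications in real-world scenarios usually do not retain all historical data, making full updates unsustainable. An effective and widely adopted solution to these problems is incremental learning. However, incremental learning-based news recommendation models face new challenges of their own, e.g., catastrophic forgetting: the domain drift caused by the non-stationary distribution of the input domain of the task flow leads to catastrophic forgetting of user preference knowledge in the old data domain. Common solution strategies include regularization-based and replay-based methods. Regularization-based methods limit the forgetting of old knowledge by adding regularization terms to the learning of new tasks, but they are limited to alignment or spatial geometric structure matching between the features learned by individual samples in a new task and the response features of the original network, and so lack a global view. Replay-based methods maintain the model's old knowledge by replaying a subset of samples from previous tasks, but may leak private data because they use past task data. To address the shortcomings of existing methods, this paper proposes OT-KR, an incremental learning method for news recommendation models based on optimal transport and knowledge replay. The OT-KR method first calculates a user's click probability score for the current candidate news based on general news recommendation architectures, then fuses the news representation, user representation, and click probability score, and reconstructs the joint distribution knowledge feature through a joint distr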
Keywords: news recommendation; incremental learning; optimal transport; multi-teacher knowledge distillation; deep learning
Classification number: TP311 [Automation and Computer Technology - Computer Software and Theory]
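
The abstract describes minimizing the distribution discrepancy between old and new tasks with optimal transport, and weighting several teachers by the order in which their tasks were learned. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes an entropy-regularized (Sinkhorn) solver, a squared Euclidean cost, uniform sample weights, and a linear order-based weighting in which later teachers count more. None of these choices is stated in the abstract, and the function names `sinkhorn_cost`, `order_weights`, and `multi_teacher_ot_loss` are hypothetical.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sinkhorn_cost(source, target, reg=1.0, n_iters=200):
    """Entropy-regularized OT cost between two uniformly weighted point sets.

    Plain Sinkhorn iterations (an assumed solver). `reg` must be large enough
    relative to the squared distances that exp(-cost / reg) stays above zero.
    """
    n, m = len(source), len(target)
    cost = [[squared_distance(s, t) for t in target] for s in source]
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    a = [1.0 / n] * n  # uniform mass on source points
    b = [1.0 / m] * m  # uniform mass on target points
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); the returned cost is <P, C>.
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


def order_weights(num_teachers):
    """Hypothetical weighting: later-learned teachers weigh more; sums to 1."""
    total = num_teachers * (num_teachers + 1) / 2
    return [(k + 1) / total for k in range(num_teachers)]


def multi_teacher_ot_loss(teacher_feature_sets, student_features, reg=1.0):
    """Weighted sum of OT costs from each teacher's features to the student's."""
    weights = order_weights(len(teacher_feature_sets))
    return sum(w * sinkhorn_cost(t, student_features, reg)
               for w, t in zip(weights, teacher_feature_sets))
```

In this sketch each teacher's feature set stands in for the joint distribution knowledge features regenerated from a stored model, so no past samples are kept. A training loop would add `multi_teacher_ot_loss` to the recommendation loss and would need a differentiable implementation in a deep learning framework.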