Authors: ZHAO Qiang (赵强), WANG Zhongqing (王中卿)[1], WANG Hongling (王红玲)[1]
Affiliation: [1] School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Source: Journal of Computer Applications (《计算机应用》), 2024, No. 1, pp. 73-78 (6 pages)
Funding: National Natural Science Foundation of China (61976146).
Abstract: On online shopping platforms, concise, authentic, and effective product summaries are crucial to improving the shopping experience. Because online shoppers cannot handle the physical product, the product image is an important source of visual information beyond the text description, so product summaries that fuse multimodal information from both product text and product images matter for online shopping. To fuse product text descriptions with product images, a product summary extraction model with multimodal information fusion is proposed. Unlike the usual product summarization task, whose input contains only the product text description, the model introduces the product image as an additional source of information to make the extracted summary richer. Specifically, pre-trained models are first used to represent the product text description and the product image separately: a text feature representation is extracted for each sentence of the description, and an overall visual feature representation of the product is extracted from the image. A low-rank tensor-based multimodal fusion method then fuses each sentence's text features with the overall visual features to obtain a multimodal feature representation for each sentence. Finally, the multimodal feature representations of all sentences are fed into a summary generator to produce the final product summary. Comparative experiments were conducted on the CEPSUM (Chinese E-commerce Product SUMmarization) 2.0 dataset. Across the three subsets of CEPSUM 2.0, the model's average ROUGE-1 (Recall-Oriented Understudy for Gisting Evaluation) is 3.12 percentage points higher than that of TextRank and 1.75 percentage points higher than that of BERTSUMExt (BERT SUMmarization Extractive). The experimental results show that fusing product text and image information is effective for product summarization and that the model performs well on the ROUGE evaluation metrics.
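The abstract does not spell out the fusion formula, so the following is only a minimal sketch of one common formulation of low-rank tensor fusion, not the authors' implementation. It assumes each modality vector is extended with a constant 1, projected by a set of rank-specific factor matrices, multiplied element-wise across modalities, and summed over the rank. The function name `low_rank_fusion`, all dimensions, and the random arrays standing in for pre-trained sentence and image features are hypothetical.

```python
import numpy as np


def low_rank_fusion(text_feats, image_feat, text_factors, image_factors, bias):
    """Fuse per-sentence text features with one product-level image feature.

    text_feats:    (n_sentences, d_text)
    image_feat:    (d_image,)
    text_factors:  (rank, d_text + 1, d_out)   hypothetical learned factors
    image_factors: (rank, d_image + 1, d_out)  hypothetical learned factors
    bias:          (d_out,)
    Returns an array of shape (n_sentences, d_out).
    """
    n = text_feats.shape[0]
    # Append a constant 1 so that unimodal terms survive the cross-modal product.
    z_text = np.concatenate([text_feats, np.ones((n, 1))], axis=1)
    z_image = np.concatenate([image_feat, np.ones(1)])
    # Project each modality with every rank-specific factor.
    proj_text = np.einsum("nd,rdo->rno", z_text, text_factors)    # (rank, n, d_out)
    proj_image = np.einsum("d,rdo->ro", z_image, image_factors)   # (rank, d_out)
    # Element-wise product per rank, then sum over the rank dimension.
    return (proj_text * proj_image[:, None, :]).sum(axis=0) + bias


# Toy usage: random vectors stand in for pre-trained text and image features.
rng = np.random.default_rng(0)
n_sentences, d_text, d_image, rank, d_out = 4, 8, 6, 3, 5
fused = low_rank_fusion(
    rng.normal(size=(n_sentences, d_text)),
    rng.normal(size=d_image),
    rng.normal(size=(rank, d_text + 1, d_out)),
    rng.normal(size=(rank, d_image + 1, d_out)),
    np.zeros(d_out),
)
print(fused.shape)  # (4, 5)
```

The point of the factorized form is that the full (d_text + 1) x (d_image + 1) x d_out weight tensor is never materialized; each sentence gets a fused vector from two small projections. In the pipeline the abstract describes, these per-sentence vectors would then be passed to the summary generator.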
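Keywords: product summarization; multimodal summarization; extractive summarization; multimodal fusion; automatic summarization
Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]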