Authors: Liu Jianwei (刘建伟) [1], Ding Xihao (丁熙浩), Luo Xionglin (罗雄麟) [1]
Affiliation: [1] Department of Automation, China University of Petroleum (Beijing), Beijing 102249, China
Source: Application Research of Computers (《计算机应用研究》), 2020, Issue 6, pp. 1601-1614 (14 pages)
Abstract: Surveying multimodal deep learning at an early stage of its development, this paper identifies the problems common to implementations of multimodal deep learning across different modality combinations and learning objectives, classifies those problems, and describes methods for solving each class. Specifically, drawing on multimodal learning involving natural language, vision, and audio, it covers research on language translation, event detection, information description, emotion recognition, voice recognition and synthesis, and multimedia retrieval, and groups the common implementation problems into four classes: multimodal representation, multimodal translation, multimodal fusion, and multimodal alignment. Each class is further sub-categorized and discussed, and the neural network models proposed to solve each class of problems are enumerated. Finally, the paper reviews practical multimodal systems, the datasets and evaluation criteria commonly used in multimodal deep learning research, and the prospective development trends of the field.
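To make the abstract's problem classes concrete, the sketch below illustrates one common form of multimodal fusion: each modality is encoded separately and the resulting feature vectors are concatenated before a joint classifier. This is a minimal illustrative example, not code from the surveyed paper; the encoder structure, feature dimensions, and concatenation-based fusion strategy are all assumptions chosen for brevity.

```python
# Illustrative sketch only: a minimal late-fusion model in PyTorch.
# Encoders, dimensions, and the concatenation fusion are hypothetical,
# not the method of any specific paper covered by the survey.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=300, image_dim=2048, hidden_dim=256, num_classes=5):
        super().__init__()
        # Per-modality encoders project each input into a same-size space.
        self.text_encoder = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        self.image_encoder = nn.Sequential(nn.Linear(image_dim, hidden_dim), nn.ReLU())
        # Fusion here is simple concatenation followed by a joint classifier;
        # alternatives include element-wise products, attention, or gating.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, text_feats, image_feats):
        t = self.text_encoder(text_feats)    # (batch, hidden_dim)
        v = self.image_encoder(image_feats)  # (batch, hidden_dim)
        fused = torch.cat([t, v], dim=-1)    # concatenation fusion
        return self.classifier(fused)

# Usage with random stand-in features for a batch of 4 examples.
model = LateFusionClassifier()
logits = model(torch.randn(4, 300), torch.randn(4, 2048))
print(logits.shape)  # torch.Size([4, 5])
```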
Keywords: multimodal; deep learning; multiple neural networks; multimodal representation; multimodal translation; multimodal fusion; multimodal alignment
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]