The Impact of Multimodal Large Language Models on Open-Source Audio-Visual Information Research

Original title: 多模态大语言模型对开源声像信息研究的影响


Authors: WU Shuyi (吴叔義); GUO Xiufeng (郭秀峰); HOU Li (侯丽) (Military Science Information Research Center, Academy of Military Science, Beijing 100142, China)

Affiliation: [1] Military Science Information Research Center, Academy of Military Science, Beijing 100142

Source: National Defense Technology (《国防科技》), 2024, No. 3, pp. 73-80, 92 (9 pages)

Abstract: Open-source audio-visual information research, a component of defense technology information research, has grown increasingly important in the current era of booming self-media and short video. In the wake of the large-model wave, it is important to analyze in depth how multimodal large language models affect this research. By surveying the technical characteristics and application scenarios of several multimodal large language models, this paper proposes potential directions for applying them in open-source audio-visual information research, providing a reference for work in this field. At present, multimodal large language models are not yet ready for direct deployment, but they will be an important driver in reshaping and restructuring audio-visual information research. Their generative capabilities also pose major challenges for the field. Open-source audio-visual information research has entered a strategic window of opportunity for transformation and upgrading.

Keywords: multimodal large language models; open-source audio-visual information; artificial intelligence

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
