Video Summarization Generation Based on Self-attention Mechanism and Random Forest Regression

Cited by: 4


Authors: 李雷霆 (LI Leiting); 武光利 (WU Guangli); 郭振洲 (GUO Zhenzhou) [1]

Affiliations: [1] School of Cyber Security, Gansu University of Political Science and Law, Lanzhou 730070, China; [2] Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou 730030, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2022, No. 4, pp. 198-205 (8 pages)

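Funding: Natural Science Foundation of Gansu Province (20JR10RA334); Lanzhou Talent Innovation and Entrepreneurship Project (2020-RC-27); 2021 Longyuan Youth Innovation and Entrepreneurship Talent Project (2021LQGR20); Gansu University of Political Science and Law Major Scientific Research and Innovation Project (GZF2020XZDA03); Gansu Provincial Higher Education Innovation Capacity Improvement Project (2020B-167).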

Abstract: Video summarization compresses a video by generating key frames or key segments. It greatly shortens viewing time while preserving the main content, and is widely used in fast video browsing and retrieval. Most existing methods explore only image content and ignore the temporal nature of video, and their models learn poorly from fluctuating data, so the generated summaries lack temporal coherence and representativeness. This paper proposes a video summarization network built on an encoder-decoder framework. In the encoder, a convolutional neural network extracts features and a self-attention mechanism raises the weights of key features. The decoder consists of a bi-directional long short-term memory (Bi-LSTM) network fused with a random forest; adjusting the proportions of the random forest and the Bi-LSTM network in the loss function gives the model strong stability and prediction accuracy. The method is compared with seven other methods on two datasets, and the combined experimental results demonstrate its effectiveness and feasibility. In summary, the proposed network uses self-attention to optimize features and combines the Bi-LSTM network with a random forest to improve stability and generalization, effectively reduce the loss value, and make the generated summaries better match users' visual characteristics.
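The two ideas the abstract names, self-attention over frame features and a loss that balances the Bi-LSTM and random forest terms, might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the attention uses the raw features as queries, keys, and values (no learned projections), the loss is assumed to be a convex combination of two mean-squared errors, and the names `self_attention`, `combined_loss`, and `alpha` are hypothetical.

```python
import math


def self_attention(features):
    """Scaled dot-product self-attention with Q = K = V = features.

    Simplification: no learned projection matrices are applied.
    """
    d = len(features[0])
    attended = []
    for query in features:
        # Similarity of this frame to every frame, scaled by sqrt(d).
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            for key in features
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is a weighted sum of all frame features.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, features)) for j in range(d)]
        )
    return attended


def mse(pred, target):
    """Mean squared error between two equal-length score lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def combined_loss(lstm_scores, forest_scores, target, alpha=0.5):
    """Assumed form: alpha weights the Bi-LSTM term, (1 - alpha) the forest term."""
    return alpha * mse(lstm_scores, target) + (1 - alpha) * mse(forest_scores, target)
```

Under this assumed form, setting `alpha` to 1.0 would rely on the Bi-LSTM term alone, and lowering it shifts weight toward the random forest term; the abstract does not state which proportions the paper uses.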

Keywords: computer vision; video summarization; self-attention mechanism; long short-term memory network; random forest regression

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
