Mobile Video Perceptual Quality Assessment Model With ResNet-TSM and BiGRU Network (cited by: 1)

Authors: DU Lina; YANG Shuo [1,2]; ZHUO Li; ZHANG Jing [1,2]; LI Jiafeng

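Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [2] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China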

Source: Journal of Beijing University of Technology, 2024, No. 1, pp. 18-26 (9 pages)

Funding: National Natural Science Foundation of China (61531006); Beijing Natural Science Foundation (KZ201910005007).

Abstract: The effects of stalling, quality switching, content characteristics, and other factors on quality of experience are all directly reflected in the distorted video at the client. On this basis, a client-side mobile video perceptual quality assessment model is proposed. Rather than characterizing and measuring each influencing factor separately, the model follows a "deep feature extraction + regression" approach and directly maps the distorted video to the mean opinion score (MOS). First, a ResNet-TSM network is constructed to extract deep spatial-temporal features from each distorted video segment. To avoid the curse of dimensionality, the LargeVis algorithm is then used to reduce the dimensionality of the extracted features while improving their representational and discriminative power. Next, a bidirectional gated recurrent unit (BiGRU) network models the long-term dependencies of the video to produce a score for each segment, and temporal mean pooling aggregates the segment scores into an overall video score. Experimental results on the WaterlooSQoE-Ⅲ and LIVE-NFLX-Ⅱ datasets show that the proposed model achieves higher prediction accuracy.
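As a rough illustration of two steps named in the abstract, the sketch below shows a temporal shift over per-frame channel vectors (the operation behind a temporal shift module) and temporal mean pooling of segment scores. It is not the authors' implementation: the function names, the flattened per-frame feature layout, and the `fold_div` shift proportion are all assumptions made for this example, and the ResNet backbone, LargeVis reduction, and BiGRU regressor are left out.

```python
from typing import List

# Hypothetical layout: one list of channel values per frame.
Frame = List[float]


def temporal_shift(frames: List[Frame], fold_div: int = 8) -> List[Frame]:
    """Shift a fraction of channels along the time axis.

    The first C // fold_div channels of each frame are taken from the
    next frame, the following C // fold_div channels from the previous
    frame, and the remaining channels stay in place. Positions with no
    source frame are zero-filled. fold_div is an assumed parameter.
    """
    if not frames:
        return []
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for ch in range(num_channels):
            if ch < fold:
                src = t + 1      # value comes from the next frame
            elif ch < 2 * fold:
                src = t - 1      # value comes from the previous frame
            else:
                src = t          # channel is left unshifted
            if 0 <= src < num_frames:
                shifted[t][ch] = frames[src][ch]
    return shifted


def temporal_mean_pooling(segment_scores: List[float]) -> float:
    """Aggregate per-segment scores into one video score by averaging."""
    if not segment_scores:
        raise ValueError("at least one segment score is required")
    return sum(segment_scores) / len(segment_scores)
```

In a full pipeline under these assumptions, the shifted features would feed the convolutional layers, and the per-segment scores passed to `temporal_mean_pooling` would come from the recurrent regressor.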

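Keywords: video perceptual quality assessment; mean opinion score (MOS); convolutional neural network; temporal shift module (TSM); bidirectional gated recurrent unit (BiGRU); deep spatial-temporal features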

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
