Authors: DU Lina (杜丽娜), YANG Shuo (杨硕)[1,2], ZHUO Li (卓力), ZHANG Jing (张菁)[1,2], LI Jiafeng (李嘉锋)
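Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [2] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China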
Source: Journal of Beijing University of Technology (《北京工业大学学报》), 2024, No. 1, pp. 18-26 (9 pages)
Funding: National Natural Science Foundation of China (61531006); Beijing Natural Science Foundation (KZ201910005007).
Abstract: The effects of stalling, quality switching, content characteristics, and other factors on the user's quality of experience are all directly reflected in the distorted video at the client. Based on this observation, a client-side perceptual quality assessment model for mobile video is proposed. Rather than characterizing and measuring each influencing factor separately, the model follows a "deep feature extraction + regression" approach and directly maps the distorted video to the mean opinion score (MOS). First, a ResNet-TSM network is constructed to extract deep spatial-temporal features from each distorted video segment. To avoid the curse of dimensionality, the LargeVis algorithm is then used to reduce the dimensionality of the extracted features while improving their representational and discriminative power. Next, a bidirectional gated recurrent unit network models the long-term dependencies of the video and produces a score for each segment. Finally, temporal mean pooling aggregates the segment scores into a score for the whole video. Experimental results on the WaterlooSQoE-Ⅲ and LIVE-NFLX-Ⅱ datasets show that the proposed model achieves higher prediction accuracy.
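Keywords: video perceptual quality assessment; mean opinion score; convolutional neural network; temporal shift module; bidirectional gated recurrent unit; deep spatial-temporal features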
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
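The record gives no implementation details, so the following is only a minimal sketch of two building blocks named in the abstract: a temporal shift over per-frame channel features, as the temporal shift module is commonly described, and temporal mean pooling of segment scores. The function names `temporal_shift` and `temporal_mean_pool`, the `fold_div` parameter and its default of 8, and the plain-list data layout are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import fmean


def temporal_shift(frames, fold_div=8):
    """Shift a fraction of channels along the time axis (hypothetical helper).

    frames: list of T frames, each a list of C channel values.
    The first C // fold_div channels take their value from the next frame,
    the following C // fold_div channels take it from the previous frame,
    and the remaining channels are left unchanged. Positions with no
    neighbouring frame are zero-padded.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div  # assumed shift proportion
    shifted = [list(frame) for frame in frames]
    for t in range(num_frames):
        for ch in range(fold):
            # Pull information from the future frame.
            shifted[t][ch] = frames[t + 1][ch] if t + 1 < num_frames else 0.0
        for ch in range(fold, 2 * fold):
            # Pull information from the past frame.
            shifted[t][ch] = frames[t - 1][ch] if t >= 1 else 0.0
    return shifted


def temporal_mean_pool(segment_scores):
    """Aggregate per-segment scores into one video score by averaging."""
    return fmean(segment_scores)


if __name__ == "__main__":
    toy_frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    print(temporal_shift(toy_frames, fold_div=4))
    print(temporal_mean_pool([60.0, 70.0, 80.0]))
```

In a full pipeline the segment scores would come from the regression stage rather than being supplied by hand; the toy values here only illustrate the data flow.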