Lightweight Chinese Lipreading Model and Dataset (轻量化汉语唇读模型及数据集构建)

Authors: SUN Baosheng (孙保胜); XIE Dongliang (谢东亮) (School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Affiliation: [1] School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876

Source: Journal of Beijing University of Posts and Telecommunications (《北京邮电大学学报》), 2023, No. 4, pp. 58-63 (6 pages)

Abstract: To promote the rapid development and practical application of Chinese lipreading, a lightweight lipreading model based on a combination of interleaved group convolution and dilated convolution is proposed. The model uses interleaved group convolution to learn the correlations between different features and dilated convolution to enlarge its receptive field, which greatly reduces the parameter count and complexity of the model while improving recognition accuracy. To address the scarcity of Chinese lipreading datasets, a sentence-level Chinese lipreading dataset, reported by the authors as the largest of its kind, was recorded in a controlled environment. Experiments on the recorded dataset and on public datasets verify the applicability of the lightweight model and demonstrate its effectiveness. Heatmap visualization is used to analyze how well the model learns the mapping between video frames and text.
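The abstract attributes the parameter savings to interleaved group convolution and the wider field of view to dilated convolution. The following is a minimal arithmetic sketch of both effects, not the authors' implementation: the abstract gives no layer sizes, so the channel count, kernel size, group count, and dilation rates below are illustrative assumptions, and the helper names are hypothetical. It assumes 1-D convolutions with stride 1 and no bias, and an interleaved block made of a grouped k-tap convolution, a channel permutation, and a grouped 1-tap convolution.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 1-D convolution (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel


def igc_params(channels, kernel, primary_groups):
    """Weights in one interleaved block (assumed layout, see lead-in):
    a grouped k-tap convolution followed by a grouped 1-tap convolution
    whose group count is channels // primary_groups."""
    if channels % primary_groups:
        raise ValueError("channels must be divisible by primary_groups")
    secondary_groups = channels // primary_groups
    primary = conv_params(channels, channels, kernel, primary_groups)
    secondary = conv_params(channels, channels, 1, secondary_groups)
    return primary + secondary


def interleave(order, groups):
    """Permute channels so that each secondary group receives exactly
    one channel from every primary group."""
    per_group = len(order) // groups
    return [order[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def receptive_field(kernel, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel - 1) * d for d in dilations)


if __name__ == "__main__":
    # Illustrative sizes only; none of these come from the paper.
    print("dense conv weights:", conv_params(64, 64, 3))
    print("interleaved block weights:", igc_params(64, 3, 8))
    print("channel order after interleave:", interleave(list(range(6)), 2))
    print("receptive field, dilations 1,2,4:", receptive_field(3, [1, 2, 4]))
    print("receptive field, no dilation:", receptive_field(3, [1, 1, 1]))
```

With these assumed sizes, the interleaved block needs 2048 weights against 12288 for the dense convolution, and three dilated layers cover 15 input frames against 7 without dilation, at the same weight count per layer.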

Keywords: Chinese lipreading; lightweight; interleaved group convolution; dilated convolution

Classification: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]
