Continuous sign language recognition based on body pose key points detection and algorithm fusion  Cited by: 1


Authors: CHEN Ya-xi [1]; WU Fei; ZHAO Ding-hao (School of Computer Science and Engineering, Southwest Minzu University, Chengdu 610041, China)

Affiliation: [1] School of Computer Science and Engineering, Southwest Minzu University, Chengdu 610041, Sichuan, China

Source: Journal of Southwest Minzu University (Natural Science Edition), 2023, No. 2, pp. 165-172 (8 pages)

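Funding: Sichuan Science and Technology Program (2019YFH0055); Sichuan Province 2021-2023 Higher Education Talent Training Quality and Teaching Reform Project (JG2021-401).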

Abstract: Compared with the recognition of single sign language words, continuous sign language recognition is both more meaningful and more difficult to study. It must pay closer attention to temporal dependencies across the whole sentence, as well as to temporal segmentation, that is, where one sign language word in the sentence ends and the next begins. Further research on and optimization of a single recognition algorithm is unlikely to produce a major breakthrough in the short term. We therefore propose a continuous sign language recognition method based on body pose key point detection and CNN+BLSTM algorithm fusion. First, key frames are processed with the inter-frame difference method, and MediaPipe is then used to detect and save body pose key point data, which reduces the amount of data while providing effective, direct input. Next, in the fused CNN+BLSTM model, the CNN focuses on local perception and captures spatial feature relationships, while the BLSTM focuses on temporal modeling of the feature sequence, highlighting the dependencies of continuous sign language along the time dimension. Finally, CTC is used to align labels with sentences. The algorithm achieves an average recognition rate of 98.4% on the CSL dataset.
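The abstract names its first and last pipeline stages (inter-frame differencing and CTC alignment) without giving details. The sketch below is only an illustration of those two general techniques, not the authors' implementation. The function names, the mean-absolute-difference measure, the threshold value, the choice to compare against the last kept frame, and the blank index 0 are all assumptions. The MediaPipe key point detection and the CNN+BLSTM model are omitted.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (an assumed representation).
Frame = Sequence[Sequence[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count if count else 0.0


def select_key_frames(frames: Sequence[Frame], threshold: float = 10.0) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    The threshold is a hypothetical value; a real system would tune it.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Best-path CTC decoding: collapse repeated labels, then drop blanks."""
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


if __name__ == "__main__":
    # Five tiny 2x2 frames: near-duplicates are skipped, large changes are kept.
    frames = [
        [[0, 0], [0, 0]],
        [[1, 1], [1, 1]],
        [[50, 50], [50, 50]],
        [[52, 52], [52, 52]],
        [[0, 0], [0, 0]],
    ]
    print(select_key_frames(frames))
    # Per-frame label predictions; a blank between two 3s keeps both of them.
    print(ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0]))
```

In the decoding step, the blank label is what allows the same sign word to appear twice in a row in the output: repeated labels are merged only when no blank separates them.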

Keywords: continuous sign language recognition; deep learning; CNN; BLSTM; body pose

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
