Authors: YUAN Tian-tian[1]; YANG Xue[1] (Technical College for the Deaf, Tianjin University of Technology, Tianjin 300384, China)
Source: Journal of Shandong Agricultural University (Natural Science Edition), 2021, No. 1, pp. 143-148 (6 pages)
Funding: Tianjin Industrial Enterprise Development Special Fund Project (201807111).
Abstract: Computer vision is an important direction in the development of China's new generation of artificial intelligence technology. Sign language recognition is difficult because signing is continuous and subject to interference from complex scenes; research on it therefore not only addresses the real need of deaf and hard-of-hearing people for barrier-free communication but can also advance video understanding and analysis, with practical applications in security and intelligent surveillance. By comparing gesture recognition methods from China and abroad that are based on video description and analysis, the paper presents a strategy analysis of video sign language recognition and of deep-learning-based video description. It compares methods that use raw video frames, video optical flow, and current state-of-the-art pose estimation, and then proposes a multimodal description strategy, a training model architecture, and a spatiotemporal attention model suited to Chinese sign language video data. With a video description and training method assisted by depth information, experiments show that the BLEU-4 score can reach 52.3, about 20% higher than the baseline method used previously. Because the depth information this method relies on is not easy to obtain in real-world settings, describing and recognizing ordinary RGB video captured by mobile phone or computer cameras is the direction for future work.
Classification code: TP387 [Automation and Computer Technology: Computer Systems Architecture]