Speaker-Dependent Speech Recognition Algorithm for Laparoscopic Supporter Control

Cited by: 3

Authors: Ren Kailong, Wang Yi[1], Chen Xiaodong[1], Cai Huaiyu[1] (School of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China)

Affiliation: [1] School of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072

Source: Laser & Optoelectronics Progress (《激光与光电子学进展》), 2020, No. 18, pp. 374-382 (9 pages)

Abstract: A long short-term memory (LSTM) recurrent neural network fused with i-vector features is proposed for voice control of a laparoscopic supporter. With only a small training set, the model recognizes short, isolated-word commands spoken by a specific surgeon. The LSTM recurrent neural network serves as the base model and takes Mel-frequency cepstral coefficients (MFCC) as its input features. The i-vector is supplied as deep-layer input: it is concatenated with the deep features output after the LSTM layer, which fuses the two sets of parameters. The model thereby accurately recognizes voice commands from the designated chief surgeon and rejects voice commands from anyone else, providing a safe and intelligent speech recognition scheme for laparoscopic operation. Experiments on a self-built speech database separately verify the recognition performance on speech from the training library and the rejection performance on speech from outside it. The experimental results show that, compared with dynamic time warping (DTW) and the Gaussian mixture model-hidden Markov model (GMM-HMM), the proposed model achieves a recognition accuracy of 99.6% on in-library speaker-specific commands while keeping the false acceptance rate at 0%, and its average false acceptance rate on out-of-library speech is 2.5%. These results meet the accuracy and safety requirements of laparoscopic supporter control.

Keywords: medical optics; laparoscope; i-vector; long short-term memory; speaker-dependent speech recognition

Classification: TN912 [Electronics and Telecommunications: Communication and Information Systems]
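The fusion step described in the abstract (concatenating the i-vector with the features that follow the LSTM layer) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the single dense layer with softmax, the confidence-threshold rejection rule, the hand-picked toy weights, and the command labels "up" and "down". The abstract does not state how rejection is implemented, so the threshold is only one plausible mechanism. The helper `false_acceptance_rate` uses the common definition (accepted impostor utterances divided by all impostor utterances), which may differ from the paper's exact metric.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(lstm_feature, ivector, weights, bias, commands, threshold=0.9):
    """Concatenate the LSTM output with the i-vector and classify the result.

    Returns (command, probabilities). The command is None when the top
    probability is below `threshold`, i.e. the utterance is rejected.
    The threshold rule is an illustrative assumption, not the paper's method.
    """
    fused = list(lstm_feature) + list(ivector)  # parameter fusion by concatenation
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("each weight row must match the fused feature length")
    # One dense layer: a score per command from the fused feature vector.
    scores = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs
    return commands[best], probs


def false_acceptance_rate(impostor_decisions):
    """Fraction of impostor utterances that were accepted (decision is not None)."""
    if not impostor_decisions:
        raise ValueError("need at least one impostor decision")
    accepted = sum(1 for d in impostor_decisions if d is not None)
    return accepted / len(impostor_decisions)


if __name__ == "__main__":
    # Toy, hand-picked parameters (hypothetical; a real model would learn them).
    weights = [[3.0, 0.0, 1.0], [0.0, 3.0, -1.0]]
    bias = [0.0, 0.0]
    commands = ["up", "down"]

    confident, _ = fuse_and_classify([2.0, 0.0], [1.0], weights, bias, commands)
    ambiguous, _ = fuse_and_classify([0.1, 0.0], [0.0], weights, bias, commands)
    print(confident, ambiguous)
```

In this sketch the i-vector only enters the network after the recurrent layer, so speaker identity influences the decision without lengthening the MFCC frame sequence that the LSTM has to process.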
