Source: Journal on Communications (《通信学报》), 2006, No. 2, pp. 113-118 (6 pages)
Funding: National Natural Science Foundation of China (Grant No. 60402029)
Abstract: For keyword spotting in audio retrieval, a new two-stage retrieval system based on a syllable (pinyin) graph is proposed. The system can efficiently retrieve text information of interest from large volumes of speech data, in service of national security. It consists of a preprocessing stage and a search stage. In the preprocessing stage, the speech data is recognized into a syllable graph with high coverage, and several rounds of unsupervised maximum likelihood linear regression (MLLR) adaptation progressively improve the quality of the graph. The search stage answers frequent user queries: it only needs to find syllable strings in the graph that match the pinyin of the keyword, and it computes confidence measures with a forward-backward algorithm based on a syllable N-gram grammar to filter the search results. Experiments show that the system achieves a high recall rate and accuracy rate, and that the search stage requires only 0.01 times real time, which meets the demand for fast retrieval.
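The search stage summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the syllable graph is a DAG whose node ids are topologically ordered, that each edge carries a single probability-like score, and it omits the syllable N-gram grammar that the paper folds into its forward-backward pass. The function name `keyword_confidences`, the edge tuple layout, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict


def keyword_confidences(edges, n_nodes, keyword, threshold=0.0):
    """Find keyword syllable strings in a syllable graph and score them.

    edges: list of (src, dst, syllable, prob) with src < dst (assumed layout).
    Node 0 is the start node and node n_nodes - 1 is the final node.
    Returns a list of (start_node, end_node, posterior) for each match whose
    posterior is at least `threshold`.
    """
    if not keyword:
        return []

    out_edges = defaultdict(list)
    for edge in edges:
        out_edges[edge[0]].append(edge)

    # Forward pass: alpha[n] = total score of all paths from node 0 to n.
    alpha = [0.0] * n_nodes
    alpha[0] = 1.0
    for src, dst, _, prob in sorted(edges, key=lambda e: e[0]):
        alpha[dst] += alpha[src] * prob

    # Backward pass: beta[n] = total score of all paths from n to the end.
    beta = [0.0] * n_nodes
    beta[n_nodes - 1] = 1.0
    for src, dst, _, prob in sorted(edges, key=lambda e: e[0], reverse=True):
        beta[src] += prob * beta[dst]

    total = alpha[n_nodes - 1]
    if total == 0.0:
        return []

    hits = []

    def extend(node, idx, prob, start):
        # Follow consecutive edges whose labels spell out the keyword.
        if idx == len(keyword):
            posterior = alpha[start] * prob * beta[node] / total
            if posterior >= threshold:
                hits.append((start, node, posterior))
            return
        for _, dst, syllable, p in out_edges[node]:
            if syllable == keyword[idx]:
                extend(dst, idx + 1, prob * p, start)

    for node in range(n_nodes):
        extend(node, 0, 1.0, node)
    return hits


# Toy graph with competing syllable candidates on the first two arcs.
toy_edges = [
    (0, 1, "bei", 0.6), (0, 1, "pei", 0.4),
    (1, 2, "jing", 0.7), (1, 2, "jin", 0.3),
    (2, 3, "shi", 1.0),
]
toy_hits = keyword_confidences(toy_edges, 4, ["bei", "jing"])
```

Because the graph is built once in preprocessing, each query only walks matching edges and reuses the precomputed forward and backward scores, which is consistent with the low search-stage cost the abstract reports.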
Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]