Online Adaptation of a Cache Language Model with Dialog-Turn-Based Decay  Cited by: 1

Study of Dialog Turn-Based Decaying Cache Adaptation Model


Authors: He Wei (何伟)[1], Li Honglian (李红莲)[1], Yuan Baozong (袁保宗)[1], Lin Biqin (林碧琴)[1]

Affiliation: [1] Institute of Information Science, Northern Jiaotong University, Beijing 100044

Source: Journal of Chinese Information Processing (《中文信息学报》), 2003, No. 5, pp. 41-47 (7 pages)

Funding: Supported by the National "973" Program (G19980305011)

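Abstract: Corpora for a specific task domain are sparse and hard to collect, which severely limits the portability of dialog systems. This paper aims to use a small amount of training data collected online to adapt the language model quickly and thereby improve the recognition rate of a dialog system in a new task domain. We modify the conventional cache model and propose a cache language model based on history-unit decay, which collects adaptation data incrementally online and is linearly interpolated with a general-purpose language model. In a dialog system the history unit is the dialog turn, so the model can also be called a dialog-turn-based decaying cache language model. Experiments on two entirely different task domains, a Summer Palace tour guide and train ticket booking, show that with fewer than 1,000 adaptation sentences the recognition error rate falls, relative to the unadapted model, by 47.8% and 74.0% respectively in supervised mode and by 30.1% and 51.1% respectively in unsupervised mode.

The substantial investment required to develop a spoken language system for each specific task hampers the widespread use of speech technology. To support toolkits for porting a spoken language system to a new application rapidly and simply, this paper proposes an improved cache model, a history-unit-based decaying cache model, for online language model adaptation of spoken language systems. To capture changes in dialog state, each user utterance and system response is collected and used for training. When a dialog turn finishes, the cache is updated, and the bigram counts become fractional after decay. The cache bigram is interpolated with the generic trigram. Experiments are performed on two contrasting tasks: train travel reservation and a park guide. With only several hundred utterances of adaptation data, both tasks show a satisfying reduction in character error rate for both supervised and unsupervised adaptation.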
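The update described in the abstract (decay the cached bigram counts at the end of each dialog turn, add the new turn's counts, then interpolate with a generic model) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the multiplicative per-turn decay, the `decay` and `lam` values, the class and method names, and the fallback to the generic probability for an unseen context are all assumptions, and the generic trigram probability is simply passed in as a number.

```python
from collections import defaultdict


class DecayingCacheBigram:
    """Bigram cache whose counts decay once per dialog turn (illustrative)."""

    def __init__(self, decay=0.9):
        self.decay = decay                 # assumed per-turn decay factor
        self.bigram = defaultdict(float)   # (prev, word) -> decayed count
        self.context = defaultdict(float)  # prev -> decayed count

    def end_turn(self, utterances):
        """Decay all old counts, then add counts from the finished turn.

        `utterances` is a list of token lists, e.g. the user utterance
        and the system response of one dialog turn.
        """
        for key in self.bigram:
            self.bigram[key] *= self.decay
        for key in self.context:
            self.context[key] *= self.decay
        for tokens in utterances:
            for prev, word in zip(tokens, tokens[1:]):
                self.bigram[(prev, word)] += 1.0
                self.context[prev] += 1.0

    def prob(self, prev, word):
        """Relative-frequency cache estimate from the (fractional) counts."""
        total = self.context.get(prev, 0.0)
        if total == 0.0:
            return 0.0
        return self.bigram.get((prev, word), 0.0) / total

    def interpolated(self, prev, word, p_generic, lam=0.3):
        """Linear interpolation with a generic-model probability.

        Assumption: if the cache has never seen `prev`, fall back to the
        generic probability alone.
        """
        if self.context.get(prev, 0.0) == 0.0:
            return p_generic
        return (1.0 - lam) * p_generic + lam * self.prob(prev, word)
```

After two turns with `decay=0.5`, a bigram seen only in the first turn holds a count of 0.5 while one seen in the second turn holds 1.0, which is the sense in which the counts become fractional and recent turns carry more weight.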

Keywords: computer applications; Chinese information processing; spoken dialog system; language model; cache adaptation

Classification: TP391 [Automation and Computer Technology, Computer Application Technology]

 
