A Highly Stable Term Co-Occurrence Model (cited by: 2)


Authors: Qiao Yanan (乔亚男)[1], Qi Yong (齐勇)[1], Hou Di (侯迪)[1]

Affiliation: [1] Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China

Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2009, No. 6, pp. 24-27 (4 pages)

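Funding: National High Technology Research and Development Program of China (2006AA01Z101); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20060698018).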

Abstract: Traditional term co-occurrence models lack a theoretical basis and show poor stability. To address these problems, a highly stable term co-occurrence model based on the term field is proposed. Borrowing the concept of a field from classical physics, the model defines the term field as follows: a term is the basic unit of language and an abstract description of a concept, and the term field is the range of influence of a term within a document. On this basis, quantum field theory is introduced to treat the correlation between two terms as a superposition of their term fields. This yields a functional relationship between the distance between terms and their correlation, which is then used to build the co-occurrence model. Experimental results show that, at small distances, the correlation between terms in the proposed model is approximately constant, so the model has a degree of in-window stability. For term pairs in the same category, the correlation amplitude is only 26% of the smallest amplitude among the comparison models, which indicates good dataset stability.

Keywords: term field; term co-occurrence; in-window stability; dataset stability

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
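
Illustration: the abstract describes correlation as a function of the distance between terms inside a window, but this record does not give the function itself. The sketch below only shows the general shape of a distance-weighted co-occurrence count. The Gaussian profile in `field_strength`, the `radius` and `window` parameters, and both function names are assumptions made for illustration; they are not the paper's term-field function or its parameters.

```python
import math
from collections import defaultdict


def field_strength(distance, radius=5.0):
    """Hypothetical field profile (an assumption, not the paper's function).

    A Gaussian is nearly flat close to distance 0 and decays further out.
    """
    return math.exp(-(distance / radius) ** 2)


def cooccurrence(tokens, window=10, radius=5.0):
    """Accumulate distance-weighted co-occurrence scores for term pairs.

    Each pair of distinct terms at most `window` positions apart
    contributes field_strength(distance) to the pair's score.
    """
    scores = defaultdict(float)
    for i, a in enumerate(tokens):
        last = min(i + window, len(tokens) - 1)
        for j in range(i + 1, last + 1):
            b = tokens[j]
            if a == b:
                continue  # skip a term paired with itself
            pair = tuple(sorted((a, b)))  # order-independent key
            scores[pair] += field_strength(j - i, radius)
    return dict(scores)


# Usage: score pairs in a short, made-up token sequence.
tokens = "term field model defines term correlation".split()
scores = cooccurrence(tokens, window=3)
```

Swapping `field_strength` for another profile changes how strongly the score depends on distance, which is the property the abstract's in-window stability claim refers to.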

 
