Research on Agricultural Information Retrieval Model Based on Double Semantic Space  (Cited by: 2)

Authors: Chen Yanhong (陈燕红)[1], Zhang Taihong (张太红)[1], Feng Xiangping (冯向萍)[1], Bai Tao (白涛)[1], Ma Jian (马健)[1]

Affiliation: [1] College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052

Source: Journal of Xinjiang Agricultural University, 2012, No. 3, pp. 253-258 (6 pages)

Funding: Key Science and Technology Project of the Xinjiang Uygur Autonomous Region (200931103)

Abstract: To improve semantic retrieval performance over large-scale agricultural information, an agricultural information retrieval model (IRI&LSA) is proposed, based on an improved random indexing semantic space and a latent semantic space. Using 1.2 million Chinese web pages and a small set of 2,000 Chinese agricultural web pages divided into four categories, IRI&LSA was compared experimentally with two commonly used latent semantic retrieval models, LSA-LAS2 and LSA-SDD, which are based on the single-vector Lanczos algorithm (LAS2) and the semi-discrete matrix decomposition algorithm (SDD), respectively. The average F1 value of the IRI&LSA retrieval results reached 83%, clearly higher than that of LSA-LAS2 (71%) and LSA-SDD (64%), and IRI&LSA retrieved 3.6 times as fast as LSA-LAS2 and 4.9 times as fast as LSA-SDD. These results indicate that IRI&LSA is suitable for relatively large-scale agricultural information retrieval.

Keywords: agricultural information retrieval; random indexing; latent semantic analysis; IRI&LSA

Classification code: TP391.3 [Automation and Computer Technology: Computer Application Technology]

 
