"N+V+N" and "V+N+N" Structure Phrase Recognition in Search Engine Query Logs

Cited by: 1

Source: Computer Engineering and Applications (《计算机工程与应用》), 2013, No. 6, pp. 143-147, 155 (6 pages)

Funding: National Social Science Foundation of China (No. 09CYY021)

Abstract: Phrase recognition is a preparatory step for phrase analysis. Targeting the characteristics of "N+V+N" and "V+N+N" phrases in search engine query logs, this paper applies the maximum entropy method. Features are extracted from word information, part-of-speech information, syllable count, and the tag of the preceding position to build a training set, from which a maximum entropy model for phrase recognition is learned. In open tests, the F-value reaches 85.78% for "N+V+N" phrases and 76.47% for "V+N+N" phrases, indicating good automatic recognition performance. Results in semi-open tests are better still.
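The four feature sources named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the feature keys, the ±1 context window, the B/I label scheme, and the rule of counting one Chinese character as one syllable are all assumptions made for illustration. The resulting feature dictionaries could be passed to any maximum entropy (multinomial logistic regression) trainer. The `f_value` helper uses the standard balanced F-measure, which is presumably what the reported F-values refer to.

```python
from typing import Dict, List, Tuple

# (word, part-of-speech tag); the tag set here is hypothetical.
Token = Tuple[str, str]


def extract_features(tokens: List[Token], i: int, prev_label: str) -> Dict[str, str]:
    """Build a feature dict for token i from word, POS, syllable count and previous tag."""
    word, pos = tokens[i]
    feats = {
        "word": word,
        "pos": pos,
        # Assumption: one Chinese character is counted as one syllable.
        "syllables": str(len(word)),
        # Label assigned to the preceding position (assumed B/I/O-style scheme).
        "prev_label": prev_label,
    }
    # Assumed context window of one token on each side.
    if i > 0:
        feats["word-1"], feats["pos-1"] = tokens[i - 1]
    if i + 1 < len(tokens):
        feats["word+1"], feats["pos+1"] = tokens[i + 1]
    return feats


def f_value(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A made-up "N+V+N" query: "phone / repair / tutorial".
query = [("手机", "n"), ("维修", "v"), ("教程", "n")]
features = extract_features(query, 1, prev_label="B")
```

Here `features` describes the verb in the middle of the query, together with its neighbours and the label of the preceding noun.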

Keywords: phrase recognition; search engine logs; "N+V+N"; "V+N+N"; maximum entropy method

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
