Blog clustering method based on selection of joint keywords using blog connect platform  Cited by: 2

Authors: Wang Qi (王琦)[1], Huo Weigang (霍纬纲)[2]

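Affiliations: [1] Department of Computer Science and Technology, Yuncheng University, Yuncheng, Shanxi 044000, China; [2] School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China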

Source: Application Research of Computers (《计算机应用研究》), 2017, No. 12, pp. 3560-3563, 3588 (5 pages)

Funding: Supported by the National Natural Science Foundation of China, Youth Fund (61301245)

Abstract: Full-text keyword search is time-consuming, and tags or categories suffer from ambiguity and synonymy. To address these problems, this paper proposes selecting joint keywords on a blog connect platform for blog clustering. The method assumes that the candidate keywords (or joint keywords) with which a blog post is queried can represent the topic of that post. To verify this assumption, tracking code is first embedded in the blog connect (BC) component to collect the keywords queried by readers. Then FKRP is used to select suitable candidate keywords as joint keywords. Finally, four similarity measures (overlap projection, mutual information projection, distributed information, and the Kendall τ coefficient) are used to validate the joint keywords extracted by the BC component. The experimental results show that the proposed method gives searchers a fast route to the corresponding blog. In addition, the generated joint keywords reduce the complexity and redundancy of full-text keyword search and meet the needs of blog users well.

Keywords: keyword extraction; blog connect platform; blog clustering; joint keywords; similarity measure

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
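
Illustrative note: the abstract lists the Kendall τ coefficient and an overlap-based measure among its similarity measures but does not give their formulas. The sketch below shows the textbook Kendall τ (tau-a, no ties) and the standard set overlap coefficient on two keyword rankings. It is an assumption about what such a comparison might look like, not the paper's implementation: the paper's "overlap projection" may be defined differently, and the function names and sample keywords are hypothetical.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a between two rankings of the same items (no ties assumed)."""
    items = list(rank_a)
    if set(items) != set(rank_b):
        raise ValueError("both rankings must cover the same items")
    if len(items) < 2:
        raise ValueError("need at least two items")
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        # A pair is concordant when both rankings order x and y the same way.
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / n_pairs


def overlap_coefficient(set_a, set_b):
    """|A & B| / min(|A|, |B|); returns 0.0 if either set is empty."""
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / min(len(set_a), len(set_b))


# Hypothetical data: keyword ranks from reader queries vs. ranks from full text.
query_rank = {"python": 1, "pandas": 2, "numpy": 3, "plot": 4}
fulltext_rank = {"python": 1, "numpy": 2, "pandas": 3, "plot": 4}

tau = kendall_tau(query_rank, fulltext_rank)
overlap = overlap_coefficient({"python", "pandas", "numpy"},
                              {"python", "numpy", "plot"})
```

A τ close to 1 would indicate that the query-derived keywords rank in nearly the same order as the reference ranking, while the overlap coefficient only checks shared membership and ignores order.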

 
