Experimental Research on Correcting the Words Sequence of Product Features Extracted from Chinese Reviews  Cited by: 2


Authors: Li Shi (李实)[1], Lu Guang (陆光)[1]

Affiliation: [1] College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040

Source: Science Technology and Engineering (《科学技术与工程》), 2012, No. 21, pp. 5181-5186 (6 pages)

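Funding: Supported by the National Natural Science Foundation of China (71001023) and the Fundamental Research Funds for the Central Universities (DL11BB25)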

Abstract: The Internet has become a main medium for exchanging information and opinions, and therefore an excellent source for gathering consumer opinions about products. So far, however, few studies have addressed review mining for Chinese text. To advance research in this area, this work aims to extract the product features that users care about from Chinese customer reviews. The method is based on the Apriori algorithm from association rule theory. It computes the probability of the positions at which the component words of each frequent itemset appear in the text, and uses these probabilities to determine the word order of the mined candidate product features, so that the results conform to standard Chinese grammar. Customer reviews downloaded from several popular websites serve as the corpus, and the experimental results show that the proposed method improves the performance of product feature extraction from Chinese customer reviews.
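The abstract does not give the exact formula for the position probability, so the following is only a minimal sketch of one plausible reading: for a frequent itemset (an unordered set of words, as Apriori would produce), count how often its words occur in each relative order within word-segmented sentences, and keep the most probable order. The function name `order_feature_words`, the tie handling, and the sample sentences are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def order_feature_words(itemset, sentences):
    """Pick the most frequent in-sentence ordering of the words in `itemset`.

    itemset:   an unordered collection of distinct words (hypothetical Apriori output).
    sentences: a list of token lists (reviews already word-segmented).
    Returns (ordered_words, estimated_probability_of_that_order).
    """
    words = list(itemset)
    order_counts = Counter()
    for tokens in sentences:
        # Only sentences that contain every word of the itemset count as evidence.
        if not all(w in tokens for w in words):
            continue
        # Arrange the words by the position of their first occurrence.
        observed = tuple(sorted(words, key=tokens.index))
        order_counts[observed] += 1

    total = sum(order_counts.values())
    if total == 0:
        # No evidence in the corpus: fall back to a fixed (sorted) order.
        return tuple(sorted(words)), 0.0
    best_order, count = order_counts.most_common(1)[0]
    return best_order, count / total


# Illustrative, made-up data: 屏幕 = "screen", 分辨率 = "resolution".
sentences = [
    ["屏幕", "分辨率", "很", "高"],
    ["分辨率", "屏幕", "不错"],
    ["这", "屏幕", "分辨率", "清晰"],
    ["电池", "耐用"],
]
best_order, probability = order_feature_words({"分辨率", "屏幕"}, sentences)
print(best_order, probability)
```

In this toy corpus two of the three sentences containing both words place 屏幕 before 分辨率, so that order is selected. A real pipeline would also need Chinese word segmentation and the Apriori mining step itself, both of which are outside this sketch.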

Keywords: product features; Web mining; sentiment analysis; user reviews

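Classification: TP391.3 [Automation and Computer Technology: Computer Application Technology]; TP311 [Automation and Computer Technology: Computer Science and Technology]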

 
