Performance analysis of new word weighting procedures for opinion mining  (Cited by: 2)

Authors: G. R. BRINDHA, P. SWAMINATHAN, B. SANTHI

Affiliation: [1] School of Computing, SASTRA University

Source: Frontiers of Information Technology & Electronic Engineering (English edition), 2016, No. 11, pp. 1186-1198 (13 pages)

Abstract: The proliferation of forums and blogs leads to challenges and opportunities for processing large amounts of information. The information shared on various topics often contains opinionated words, which are qualitative in nature. These qualitative words need statistical computations to convert them into useful quantitative data. This data should be processed properly since it expresses opinions. Each of these opinion-bearing words differs in the significance of the meaning it conveys. To process the linguistic meaning of words into data and to enhance opinion mining analysis, we propose a novel weighting scheme, referred to as inferred word weighting (IWW). IWW is computed based on the significance of the word in the document (SWD) and the significance of the word in the expression (SWE) to enhance their performance. The proposed weighting methods give an analytic view and assign more appropriate weights to words than existing methods. In addition to the new weighting methods, the effect of including stop-words on text classification performance is also examined. Generally, stop-words are removed in text processing. When stop-words are included under both the proposed and the existing weighting methods, two facts are observed: (1) classification performance is enhanced; (2) the difference in outcome between inclusion and exclusion of stop-words is smaller for the proposed methods and larger for existing methods. The inferences provided by these observations are discussed. Experimental results on benchmark data sets show the potential enhancement in terms of classification accuracy.
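The stop-word observation in the abstract can be illustrated with a small sketch. The abstract does not give the IWW, SWD, or SWE formulas, so the code below does not implement them; it uses plain TF-IDF as a stand-in for an "existing weighting method". The stop-word list, the three sample sentences, and the function names `tokenize` and `tf_idf` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy stop-word list (assumption); real lists are much longer.
# Some commonly used lists include negations such as "not".
STOP_WORDS = {"the", "is", "a", "this", "not", "and", "was"}


def tokenize(text, keep_stop_words):
    """Lowercase and split on whitespace, optionally dropping stop-words."""
    tokens = text.lower().split()
    if keep_stop_words:
        return tokens
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs, keep_stop_words):
    """Return one {term: weight} dict per document using plain TF-IDF.

    This is a stand-in for an existing weighting method, not IWW.
    """
    tokenized = [tokenize(d, keep_stop_words) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sample reviews (assumption), not taken from the benchmark data sets.
docs = [
    "the movie was good",
    "the movie was not good",
    "this plot is a mess",
]

with_stop = tf_idf(docs, keep_stop_words=True)
without_stop = tf_idf(docs, keep_stop_words=False)

# With stop-words removed, the first two reviews become indistinguishable.
print(without_stop[0] == without_stop[1])
# With stop-words kept, "not" carries a weight that separates them.
print(with_stop[1].get("not"))
```

In this toy setting, removing stop-words erases the only token that distinguishes a positive review from a negative one, which is one plausible reason why keeping stop-words could help an opinion classifier. Whether this mechanism explains the results reported in the paper is not stated in the abstract.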

Keywords: Inferred word weight; Opinion mining; Supervised classification; Support vector machine (SVM); Machine learning

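Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP391.1 [Automation and Computer Technology: Control Science and Engineering]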

 
