An Effective Machine Learning Approach with Hyperparameter Tuning for Sentiment Analysis  


Authors: Saima Kanwal, Ali Raza, Chunyan Bai, Dawei Zhang, Jing Wen, Dileep Kumar

Affiliations: [1] Engineering Research Centre of Optical Instrument and Systems, Ministry of Education and Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, No. 516 Jun Gong Road, Shanghai 200093, China [2] Department of Mathematics, School of Sciences, Hebei University of Technology, Beichen Campus, Tianjin 300401, China [3] Shanghai Publishing and Printing College, Shanghai 200093, China [4] Zhangjiang Laboratory, Shanghai 200120, China [5] Department of Electronic Engineering, Faculty of Engineering, The Islamia University of Bahawalpur, 63100, Punjab, Pakistan

Source: Data Intelligence, 2025, Issue 1, pp. 70-94 (25 pages)

Funding: Supported by the National Key R&D Program of China (2018YFA0701800) and the National Natural Science Foundation of China (NSFC, project no. 62175153).

Abstract: Sentiment analysis depends on individuals’ comments and opinions on events. Data from social media platforms like Twitter, Quora, or Facebook poses challenges due to informal language, including acronyms, misspellings, and ambiguous terms. Additionally, hyperparameters in machine learning models significantly impact performance. To address these issues, we propose advanced feature engineering techniques in Natural Language Processing (NLP) and hyperparameter optimization to enhance prediction accuracy and generalization capabilities. Our study employs Naïve Bayes, Logistic Regression (LR), Multi-layer Perceptron (MLP), and Support Vector Machine (SVM) to classify sentiments in tweets about Elon Musk’s potential acquisition of Twitter. The dataset, consisting of 100,000 tweets, is fetched using the Twitter representational state transfer application programming interface (REST API). We outline a sentiment analysis procedure to classify unstructured Twitter data, identify influential keywords, and categorize sentiments as Positive, Negative, or Neutral. Using a hybrid Lexicon NLP approach, we extract contextually significant, emotionally charged words and assign sentiment polarities. Hyperparameter optimization via automated search methods ensures alignment with classifier performance estimates. SVM achieved an accuracy of 97%. Cross-validation minimizes random variations, providing a reliable assessment of the model’s generalization capabilities and demonstrating the method’s accuracy in predicting sentiments on larger, previously unseen standard datasets with varying sentiment.
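The abstract does not say which automated search method or parameter grid the authors used. As an illustration only, the following minimal sketch shows how a grid search scored by k-fold cross-validation can pick a hyperparameter for one of the classifier families named above. It uses a small pure-Python multinomial Naive Bayes whose smoothing constant `alpha` is the tuned hyperparameter. The toy tweets, the helper names (`train_nb`, `predict_nb`, `cv_accuracy`, `grid_search`), and the candidate `alpha` values are all assumptions made for this example and are not taken from the paper.

```python
import math
import random
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase whitespace tokenization; a real pipeline would do far more cleaning.
    return text.lower().split()

def train_nb(docs, labels, alpha):
    """Collect the counts needed by multinomial Naive Bayes with additive smoothing alpha."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, alpha

def predict_nb(model, doc):
    """Return the label with the highest smoothed log-posterior score."""
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for tok in tokenize(doc):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def k_fold_indices(n, k, seed=0):
    # Shuffle once with a fixed seed, then deal indices into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(docs, labels, alpha, k=3):
    """Mean held-out accuracy over k folds for one value of alpha."""
    scores = []
    for fold in k_fold_indices(len(docs), k):
        held_out = set(fold)
        train_idx = [i for i in range(len(docs)) if i not in held_out]
        model = train_nb([docs[i] for i in train_idx],
                         [labels[i] for i in train_idx], alpha)
        correct = sum(predict_nb(model, docs[i]) == labels[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k

def grid_search(docs, labels, alphas, k=3):
    """Evaluate every candidate alpha and return the best one with all scores."""
    results = {a: cv_accuracy(docs, labels, a, k) for a in alphas}
    return max(results, key=results.get), results

# Hypothetical toy data, invented for this sketch.
DOCS = [
    "great news love this deal", "love the new twitter", "great move good for users",
    "terrible idea hate this deal", "hate the new twitter", "bad move awful for users",
    "musk offers to buy twitter", "twitter board reviews the offer", "deal announced on monday",
]
LABELS = ["Positive"] * 3 + ["Negative"] * 3 + ["Neutral"] * 3
ALPHAS = [0.1, 0.5, 1.0, 2.0]  # assumed candidate grid

best_alpha, results = grid_search(DOCS, LABELS, ALPHAS)
print("cross-validated accuracy per alpha:", results)
print("selected alpha:", best_alpha)
```

With only nine toy tweets the scores carry no meaning; the point is the structure, in which each candidate value is judged only on data held out from training, which is what lets cross-validation damp random variation in the estimate.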

Keywords: Hybrid lexicon; Logistic regression (LR); Machine learning; Natural language processing (NLP)

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
