Comparative Analysis of Machine Learning Algorithms for Email Phishing Detection Using TF-IDF, Word2Vec, and BERT  


Authors: Arar Al Tawil, Laiali Almazaydeh, Doaa Qawasmeh, Baraah Qawasmeh, Mohammad Alshinwan, Khaled Elleithy

Affiliations: [1] Faculty of Information Technology, Applied Science Private University, Amman, 11931, Jordan [2] College of Engineering, Abu Dhabi University, Abu Dhabi, Al Ain, P.O. Box 1790, United Arab Emirates [3] Faculty of Artificial Intelligence, Al-Balqa Applied University, Salt, 19117, Jordan [4] Department of Civil and Construction Engineering, Western Michigan University, Kalamazoo, MI 49008, USA [5] MEU Research Unit, Middle East University, Amman, 11831, Jordan [6] Department of Computer Science and Engineering, University of Bridgeport, Bridgeport, CT 06604, USA

Source: Computers, Materials & Continua (English-language journal), 2024, Issue 11, pp. 3395-3412 (18 pages)

Funding: Supported by Abu Dhabi University.

Abstract: Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information, a practice known as phishing. This study uses three feature extraction methods, Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and Bidirectional Encoder Representations from Transformers (BERT), to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks. These methods are used to assess the performance of the Logistic Regression, Decision Tree, Random Forest, and Multilayer Perceptron algorithms. With TF-IDF features, the best-performing classifier was the Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). With Word2Vec features, the best-performing classifier was also the Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). The highest performance was achieved with the BERT model, with Precision, Recall, F1-score, and Accuracy all reaching 0.99. This study highlights how advanced pre-trained models, such as BERT, can significantly enhance the accuracy and reliability of fraud detection systems.
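As a rough illustration of two ideas named in the abstract, the sketch below computes TF-IDF weights for a few email texts and derives Precision, Recall, F1-score, and Accuracy from a set of predictions. It is not taken from the paper: the function names `tfidf_vectors` and `binary_metrics` are hypothetical, whitespace tokenization and the smoothed IDF variant are assumptions, and the label 1 is assumed to mean "phishing".

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents.

    Assumption: smoothed IDF, idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Toy emails, invented purely for illustration.
    vocab, vectors = tfidf_vectors(["verify your account now",
                                    "meeting notes attached"])
    print(vocab)
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a fuller pipeline, vectors like these would be the input to a classifier such as Logistic Regression or a Multilayer Perceptron, and the metrics would be computed on a held-out test set.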

Keywords: attacks; email phishing; machine learning; security; representations from transformers (BERT); text classifier; natural language processing (NLP)

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
