Feature Engineering Methods for Analyzing Blood Samples for Early Diagnosis of Hepatitis Using Machine Learning Approaches  

Authors: Mohamed A.G. Hazber, Ebrahim Mohammed Senan, Hezam Saud Alrashidi

Affiliations: [1] Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Hail, 81481, Saudi Arabia [2] Department of Computer Science, College of Applied Sciences, Hajjah University, Hajjah, 9677, Yemen [3] Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Al-Razi University, Sana’a, 9671, Yemen

Source: Computer Modeling in Engineering & Sciences (English-language journal), 2025, Issue 3, pp. 3229-3254 (26 pages)

Funding: Funded by the Scientific Research Deanship at the University of Ha’il, Saudi Arabia, through project number GR-24009.

Abstract: Hepatitis is a liver infection transmitted through contaminated food or blood transfusions; it has many types, ranging from mild to serious. Diagnosis relies on many blood tests and factors, and Artificial Intelligence (AI) techniques have played an important role in early diagnosis and in helping physicians make decisions. This study evaluated the performance of Machine Learning (ML) algorithms on a hepatitis dataset. Missing values in the dataset were processed and outliers were removed, and the classes were balanced using the Synthetic Minority Over-sampling Technique (SMOTE). The features were processed in two ways. First, the Recursive Feature Elimination (RFE) algorithm was applied to rank the percentage contribution of each feature to the diagnosis of hepatitis, followed by selection of important features using the t-distributed Stochastic Neighbor Embedding (t-SNE) and Principal Component Analysis (PCA) algorithms. Second, the SelectKBest function was applied to score each attribute, followed by the t-SNE and PCA algorithms. Finally, the classification algorithms K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Artificial Neural Network (ANN), Decision Tree (DT), and Random Forest (RF) were fed the dataset after the features had been processed by each method (RFE with t-SNE and PCA, and SelectKBest with t-SNE and PCA). All algorithms yielded promising results for diagnosing hepatitis. RF with the RFE and PCA methods achieved accuracy, precision, recall, and AUC of 97.18%, 96.72%, 97.29%, and 94.2%, respectively, during the training phase, and 96.31%, 95.23%, 97.11%, and 92.67%, respectively, during the testing phase.
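
The abstract names SMOTE as the class-balancing step but does not give implementation details. As a rough illustration of the idea only, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote_oversample`, the parameter `k`, and the two-feature data are hypothetical and are not taken from the paper; in practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used instead.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature data (illustrative only, not values from the paper).
majority = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8), (1.3, 1.1), (0.8, 0.9), (1.2, 1.3)]
minority = [(3.0, 3.1), (3.2, 2.9), (2.8, 3.3)]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "6 6"
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than duplicating existing rows.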

Keywords: hepatitis; machine learning; PCA; RFE; SelectKBest; t-SNE

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
