Harmonization of Heart Disease Dataset for Accurate Diagnosis: A Machine Learning Approach Enhanced by Feature Engineering


Authors: Ruhul Amin, Md. Jamil Khan, Tonway Deb Nath, Md. Shamim Reza, Jungpil Shin

Affiliations: [1] Department of Computer Science and Engineering, Metropolitan University, Sylhet, 3104, Bangladesh; [2] Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh; [3] Department of Computer Science & Engineering, University of Aizu, Aizu-Wakamatsu, Fukushima, 956-8580, Japan

Source: Computers, Materials & Continua (English-language journal), 2025, No. 3, pp. 3907-3919 (13 pages)

Funding: Supported by the Competitive Research Fund of the University of Aizu, Japan (Grant No. P-13).

Abstract: Heart disease includes a multiplicity of medical conditions that affect the structure, blood vessels, and general operation of the heart. Numerous researchers have made progress in correcting and predicting early heart disease, but more remains to be accomplished. The diagnostic accuracy of many current studies is inadequate because they attempt to predict heart disease using traditional approaches. By fusing data from several regions of the country, we intend to increase the accuracy of heart disease prediction. We use a statistical approach that draws insights from feature interactions to reveal intricate patterns in the data that cannot be adequately captured by a single feature. We processed the data using techniques including feature scaling, outlier detection and replacement, and null and missing value imputation, among others, to improve data quality. Furthermore, the proposed feature engineering method uses the correlation test for numerical features and the chi-square test for categorical features to construct feature interactions. To reduce the dimensionality, we subsequently applied PCA retaining 95% of the variance. To identify patients with heart disease, hyperparameter-tuned machine learning algorithms such as RF, XGBoost, Gradient Boosting, LightGBM, CatBoost, SVM, and MLP are used, along with ensemble models. The models' overall prediction performance ranges from 88% to 92%. To attain state-of-the-art results, we then used a 1D CNN model, which significantly improved prediction, with an accuracy of 96.36%, precision of 96.45%, recall of 96.36%, specificity of 99.51%, and F1 score of 96.34%. Without feature interaction, the RF model produces the best results among all the classifiers across the evaluation metrics, with accuracy of 90.21%, precision of 90.40%, recall of 90.86%, specificity of 90.91%, and F1 score of 90.63%. Our proposed 1D CNN model is 7% better than the one without feature engineering. This illustrates how interaction-focused feature analysis can produce precise and u
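The abstract names a correlation test for numerical features and a chi-square test for categorical features as the basis for feature interactions, but does not say how the interactions are built. The sketch below is one plausible reading, not the authors' method: it assumes that a numeric pair is combined by an element-wise product when its Pearson correlation exceeds a threshold, and that a categorical pair is combined by concatenating labels when its chi-square statistic exceeds a critical value. The function names, the threshold `r_threshold=0.3`, and the toy values are all hypothetical; 3.841 is the standard 5% critical value for one degree of freedom.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant feature carries no linear association
    return cov / (sx * sy)


def chi_square_stat(a, b):
    """Pearson chi-square statistic and degrees of freedom for two categorical lists."""
    n = len(a)
    row, col = Counter(a), Counter(b)
    observed = Counter(zip(a, b))
    stat = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row) - 1) * (len(col) - 1)
    return stat, dof


def numeric_interaction(x, y, r_threshold=0.3):
    """Assumed rule: multiply two numeric features if |r| reaches the threshold."""
    if abs(pearson_r(x, y)) >= r_threshold:
        return [a * b for a, b in zip(x, y)]
    return None


def categorical_interaction(a, b, critical_value):
    """Assumed rule: concatenate labels if the chi-square statistic exceeds the critical value."""
    stat, _ = chi_square_stat(a, b)
    if stat > critical_value:
        return [f"{u}_{v}" for u, v in zip(a, b)]
    return None


# Toy values, invented purely for illustration.
age = [40, 50, 60, 70]
chol = [180, 200, 240, 260]
sex = ["M", "M", "F", "F", "M", "F", "M", "F"]
angina = ["Y", "Y", "N", "N", "Y", "N", "Y", "N"]

print(numeric_interaction(age, chol))
print(categorical_interaction(sex, angina, critical_value=3.841))
```

In practice the critical value depends on the degrees of freedom returned by `chi_square_stat`, and a library routine that returns a p-value would replace the fixed cutoff. The chi-square approximation is also unreliable for expected cell counts as small as those in this toy data.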

Keywords: heart disease; harmonization; feature interaction; PCA; model hyper tuning; machine learning

Classification codes: R541 [Medicine and Health: Cardiovascular Diseases]; TP181 [Medicine and Health: Internal Medicine]

 
