Classification Analysis of Lung Tumors in Nude Mice by Laser-induced Breakdown Spectroscopy Combined with RF-BPNN Algorithm

Authors: Peng Yingjie; Lian Qianlin; Ma Yue; Chen Jianjun (School of Public Health, Xinjiang Medical University, Urumqi 830017, China; Xinjiang Urumqi Maternal and Child Health Hospital, Urumqi 830001, China; School of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017, China)

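Affiliations: [1] School of Public Health, Xinjiang Medical University, Urumqi 830017 [2] Xinjiang Urumqi Maternal and Child Health Hospital, Urumqi 830001 [3] School of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830017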

Source: Bulletin of Science and Technology (《科技通报》), 2025, No. 3, pp. 21-28 (8 pages)

Funding: National Natural Science Foundation of China (62265016).

Abstract: Lung cancer is one of the malignant tumors with high incidence and mortality in China and worldwide. Early detection, diagnosis, and treatment can significantly improve the prognosis of lung cancer patients and reduce mortality. This paper uses laser-induced breakdown spectroscopy (LIBS) combined with machine learning algorithms to diagnose and distinguish lung tumor tissue from muscle tissue in nude mice. A laser with a wavelength of 532 nm and an energy of 40 mJ was used to examine spectral differences across 200 nude mouse section samples (100 lung tumor, 100 muscle tissue), and machine learning algorithms suited to the data characteristics were applied to classify the two tissue types. Based on the spectral peak features of the samples, 16 strong elemental spectral lines were selected as the feature vector, and the classification accuracy of the k-nearest neighbor (KNN), support vector machine (SVM), and back propagation neural network (BPNN) algorithms was compared to identify the best classifier. A random forest (RF) algorithm was then used to rank the variables by importance, and the variables whose importance exceeded the mean variable importance were selected as the new feature vector for the best classifier. The models were evaluated by five-fold cross-validation using accuracy, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). The results show that: (1) Comparison of the LIBS spectra shows that lung tumor tissue and normal muscle tissue have similar spectral types, and both contain characteristic information on metallic elements, nonmetallic elements, and molecular bonds. (2) Among the KNN, SVM, and BPNN algorithms, the BPNN model is the best classifier, with accuracy, sensitivity, and specificity of 91.67%, 97.1%, and 84.6%, respectively, and an AUC of 0.924. (3) RF importance selection reduced the number of variables from 16 to 7, addressing feature redundancy in high-dimensional data. (4) After combining the RF algorithm with the BPNN classifier, the accuracy, sensitivity, and specificity of RF-BPNN increased to 96.7%, 100%, and 94.1%, respectively, with an AUC of 0.964.
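The selection rule and evaluation metrics named in the abstract can be illustrated with a minimal sketch. It does not train a random forest or a BPNN, and it is not the authors' implementation: it assumes that importance scores and classifier outputs are already available from some other tool. All function names (`select_above_mean`, `binary_metrics`, `auc`, `k_fold_indices`) are hypothetical, and the interleaved fold assignment is only one simple way to form five folds.

```python
from statistics import mean


def select_above_mean(importances):
    """Return indices of features whose importance exceeds the mean importance."""
    threshold = mean(importances)
    return [i for i, value in enumerate(importances) if value > threshold]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, n_folds=5):
    """Yield (train, test) index lists; sample i goes to test fold i % n_folds."""
    for k in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == k]
        train = [i for i in range(n_samples) if i % n_folds != k]
        yield train, test
```

In use, one would compute importance scores for the 16 spectral-line features, keep the indices returned by `select_above_mean`, retrain the classifier on those columns inside each fold from `k_fold_indices`, and report the metrics on the held-out fold, with the tumor class taken as `positive`.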

Keywords: lung tumor; laser-induced breakdown spectroscopy; machine learning; back propagation neural network

CLC number: TN249 [Electronics and Telecommunications: Physical Electronics] R734.2 [Medicine and Health: Oncology]

 
