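Affiliations: [1] School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166; [2] National Health Commission Key Laboratory of Parasitic Disease Control and Prevention, Jiangsu Provincial Key Laboratory on Parasite and Vector Control Technology, Jiangsu Institute of Parasitic Diseases, Wuxi, Jiangsu 214064; [3] Yunnan Institute of Endemic Disease Control and Prevention, Yunnan Provincial Key Laboratory of Natural Focal Disease Control and Prevention
Source: Chinese Journal of Schistosomiasis Control (《中国血吸虫病防治杂志》), 2024, No. 6, pp. 555-561 (7 pages)
Funding: National Natural Science Foundation of China (82173586, 82373644); Medical Research Project of the Jiangsu Provincial Health Commission (x202302, M2021102); "Taihu Light" Science and Technology Project of the Wuxi Science and Technology Bureau, Jiangsu Province (Y20212048).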
Abstract (translated from the Chinese): Objective: To build a visual intelligent recognition model for Oncomelania hupensis robertsoni (the Yunnan-Sichuan subspecies of O. hupensis) in Yunnan Province based on the EfficientNet-B4 model, and to evaluate how different data augmentation methods and model hyperparameters affect snail recognition. Methods: In June 2024, 400 O. hupensis and 400 Tricula snails were collected in Yongsheng County, Yunnan Province; 300 of each were selected, identified and classified, and then photographed. The resulting 925 O. hupensis images and 1062 Tricula images formed the dataset, which was split 8:2 into a training set and a validation set; 352 and 354 images captured from the remaining 100 O. hupensis and 100 Tricula snails, respectively, formed the external test set. Images were preprocessed by cropping and resizing. Three data augmentation methods were used: baseline, Mixup and Gaussian blurring. The model hyperparameters comprised two optimizers (adaptive moment estimation, Adam, and stochastic gradient descent, SGD), two loss functions (focal loss and cross entropy loss) and two learning rate decay strategies (cosine annealing and multi-step). Recognition models for O. hupensis robertsoni and Tricula snails were built on the EfficientNet-B4 model, the data augmentation methods and hyperparameters were combined into 7 training strategy groups, and model performance was tested on the external test set. Performance under each training strategy was evaluated with accuracy, precision, recall, F1 score, loss, Youden's index and the area under the receiver operating characteristic (ROC) curve (AUC). Results: Loss values were similar across models trained with different data augmentation methods. The Group 4 model, which used both Mixup and Gaussian blurring, performed best, with an accuracy of 90.38%, a precision of 90.07%, an F1 score of 89.44%, a Youden's index of 0.81 and an AUC of 0.961 on the external test set. Accuracy in the groups using the SGD optimizer was 29.16% lower than in the groups using the Adam optimizer (χ² = 81.325, P < 0.001); accuracy of the model using cross entropy loss was 0.80% lower than that of Group 4 (χ² = 3.147, P > 0.05); and accuracy of the model using the multi-step learning rate decay strategy was 0.65% higher than that of Group 4 (χ² = 0.208, P > 0.05). Using baseline + Mi
Abstract (English, as published): Objective To construct a visual intelligent recognition model for Oncomelania hupensis robertsoni in Yunnan Province based on the EfficientNet-B4 model, and to evaluate the impact of data augmentation methods and model hyperparameters on the recognition of O. hupensis robertsoni. Methods A total of 400 O. hupensis robertsoni and 400 Tricula snails were collected from Yongsheng County, Yunnan Province in June 2024, and snail images were captured following identification and classification of 300 O. hupensis robertsoni and 300 Tricula snails. A total of 925 O. hupensis robertsoni images and 1062 Tricula snail images were collected as a dataset and divided into a training set and a validation set at a ratio of 8:2, while 352 images captured from the remaining 100 O. hupensis robertsoni and 354 images from the remaining 100 Tricula snails served as an external test set. All acquired images were subjected to preprocessing, including cropping and resizing. Three data augmentation approaches were employed, including baseline, Mixup and Gaussian blurring, and model hyperparameters included two optimization algorithms of adaptive moment estimation (Adam) and stochastic gradient descent (SGD), two loss functions of focal loss and cross entropy loss, and two learning rate decay strategies of cosine annealing and multi-step. The intelligent recognition models of O. hupensis robertsoni and Tricula snails were constructed based on the EfficientNet-B4 model, and 7 training strategy groups were generated by combinations of different data augmentation approaches and hyperparameters. The performance of intelligent recognition models was tested with external test sets, and evaluated with accuracy, precision, recall, F1 score, loss, Youden's index, and the area under the receiver operating characteristic curve (AUC) under different training strategies. Results The variation of loss values was comparable among intelligent recognition models with different data augmentation approaches. The Group 4 model constructed with Mixup and Gaussian blurring data aug
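The threshold-based metrics named in the abstract (accuracy, precision, recall, F1 score and Youden's index) all follow from a 2×2 confusion matrix. The sketch below shows the standard definitions only; the function name `binary_metrics` and the counts are hypothetical and are not the paper's data or code, and every denominator is assumed to be non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one predicted positive,
    one actual positive and one actual negative).
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    youden = recall + specificity - 1  # Youden's index J
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "youden": youden,
    }


# Hypothetical counts for illustration only (NOT results from the paper):
# 100 positive images and 100 negative images.
metrics = binary_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

AUC is not included because it needs the model's continuous scores rather than counts at a single threshold.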
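Keywords: Oncomelania hupensis; Tricula; deep learning; artificial intelligence; computer vision; Yunnan Province
Classification code: R383.24 [Medicine and Health: Medical Parasitology]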
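The abstract compares focal loss with cross entropy loss. As a minimal sketch of how the two relate for a single sample, the code below uses the usual focal loss form, −α(1 − p)^γ · log(p), where p is the predicted probability of the true class. The values `gamma=2.0` and `alpha=1.0` are assumptions chosen for illustration; the abstract does not report the settings used in the study.

```python
import math


def cross_entropy(p_true):
    """Cross entropy loss for one sample; p_true is the predicted
    probability of the correct class (0 < p_true <= 1)."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample. gamma and alpha are illustrative
    defaults, not values taken from the paper."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# Easy samples (high p_true) are down-weighted far more than hard ones.
for p in (0.9, 0.5, 0.1):
    print(f"p={p}: CE={cross_entropy(p):.4f}, focal={focal_loss(p):.4f}")
```

With `gamma=0` and `alpha=1` the focal loss reduces to cross entropy, which is one way to read the small accuracy difference the abstract reports between the two loss functions.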