Affiliations: [1] School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; [2] Key Laboratory of Images and Graphics Intelligent Processing of the State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China; [3] School of Medical Information and Engineering, Ningxia Medical University, Yinchuan 750004, China
Source: Journal of Image and Graphics (中国图象图形学报), 2024, No. 4, pp. 1070-1084 (15 pages)
Funding: National Natural Science Foundation of China (62062003); Natural Science Foundation of Ningxia (2023AAC03293)
Abstract:
Objective Cancer is the second leading cause of death worldwide, and nearly one in five cancer deaths is due to lung cancer. Many cancers have a high chance of cure when detected early and treated effectively, but the atypical early symptoms of lung tumors easily lead to missing the optimal treatment window, and manual reading of lung images is time-consuming and error-prone, so effective and accurate lung tumor detection is increasingly important in computer-aided diagnosis. In PET/CT (positron emission tomography/computed tomography) multimodal images of lung tumors, however, adhesion between the tumor and surrounding tissue causes blurred edges and low contrast, and the lesion regions are small with an uneven size distribution. To address these problems, a cross-modal attention YOLOv5 (CA-YOLOv5) lung tumor detection model is proposed.
Method CT localizes lesion structures through anatomical information, while PET reveals the pathophysiological characteristics of lesions by measuring glucose metabolism; combining the two modalities identifies and localizes lesions in cases where a single modality is insufficient, improving accuracy and clinical value. First, a dual-branch parallel self-learning attention is designed in the backbone network: instance normalization learns the scaling coefficients, and the amount of information carried by each feature is estimated from the difference between the feature value and the mean, which enhances tumor features and improves contrast. Second, to fully exploit the complementary information of the multimodal images, a cross-modal attention module performs interactive learning between the modality-specific features, in which a Transformer models the long-range interdependencies between deep and shallow features and learns functional and anatomical information to improve lung tumor recognition. Finally, to handle the small lesion regions and uneven size distribution, a dynamic feature enhancement module uses multi-branch grouped dilated convolutions and grouped deformable convolutions with different receptive fields so that the network can efficiently mine the multi-scale semantic information of lung tumor features.
Result Compared with ten other methods on a lung tumor PET/CT dataset, CA-YOLOv5 achieves the best performance, with 97.37% precision, 94.01% recall, 96.36% mAP (mean average precision), and a 95.67% F1 score, while requiring the shortest training time on the same device. On the LUNA16 (lung nodule analysis 16) dataset it likewise achieves the best performance, with 97.52% precision and 97.45% mAP.
Conclusion Building on complementary multimodal features, the proposed cross-modal attention YOLOv5 detection model uses attention mechanisms and multi-scale semantic information to detect lung tumors effectively in multimodal images, making recognition more accurate and more robust.
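The dual-branch self-learning attention described in the abstract (instance normalization providing learnable scaling coefficients, plus an information measure based on each feature's deviation from the mean) can be pictured with a minimal PyTorch sketch. This is one possible reading of the description, not the authors' implementation; the module name, the sigmoid gating, and the additive fusion of the two branches are assumptions.

```python
# Hedged sketch of a dual-branch parallel self-learning attention:
# branch 1 derives channel weights from the learnable scale of instance
# normalization, branch 2 weights each position by its squared deviation
# from the channel mean. Names and fusion scheme are illustrative only.
import torch
import torch.nn as nn


class SelfLearningAttention(nn.Module):
    def __init__(self, channels: int, eps: float = 1e-5):
        super().__init__()
        # Instance normalization with a learnable affine scale (gamma) per channel.
        self.inorm = nn.InstanceNorm2d(channels, affine=True)
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Branch 1: channel weights from the learned IN scaling coefficients.
        gamma = self.inorm.weight.abs()                           # (C,)
        chan_w = (gamma / (gamma.sum() + self.eps)).view(1, -1, 1, 1)
        chan_att = torch.sigmoid(self.inorm(x)) * chan_w

        # Branch 2: "information" weights from the deviation of each feature
        # value from its channel mean (larger deviation = more salient).
        mean = x.mean(dim=(2, 3), keepdim=True)
        info = (x - mean).pow(2)
        spat_att = torch.sigmoid(info / (info.mean(dim=(2, 3), keepdim=True) + self.eps))

        # Fuse the two parallel branches and re-weight the input features.
        return x * (chan_att + spat_att)


if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 32)
    print(SelfLearningAttention(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```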
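Likewise, the cross-modal attention between PET and CT features can be sketched with standard multi-head cross-attention, where each modality's tokens query the other so that functional and anatomical information interact. The token flattening, layer normalization, and additive fusion below are illustrative choices, not details taken from the paper.

```python
# Hedged sketch of cross-modal attention between PET and CT feature maps,
# using multi-head cross-attention (queries from one modality, keys/values
# from the other) as a stand-in for the paper's Transformer-based interaction.
import torch
import torch.nn as nn


class CrossModalAttention(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.pet_from_ct = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.ct_from_pet = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm_pet = nn.LayerNorm(channels)
        self.norm_ct = nn.LayerNorm(channels)

    @staticmethod
    def _to_tokens(x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, H*W, C) token sequence for attention.
        return x.flatten(2).transpose(1, 2)

    def forward(self, pet: torch.Tensor, ct: torch.Tensor) -> torch.Tensor:
        b, c, h, w = pet.shape
        p, t = self._to_tokens(pet), self._to_tokens(ct)

        # PET tokens query anatomical context from CT; CT tokens query
        # functional (metabolic) context from PET.
        p2, _ = self.pet_from_ct(self.norm_pet(p), self.norm_ct(t), self.norm_ct(t))
        t2, _ = self.ct_from_pet(self.norm_ct(t), self.norm_pet(p), self.norm_pet(p))

        # Residual connections and simple additive fusion of the two modalities.
        fused = (p + p2) + (t + t2)
        return fused.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    pet, ct = torch.randn(2, 128, 16, 16), torch.randn(2, 128, 16, 16)
    print(CrossModalAttention(128)(pet, ct).shape)  # torch.Size([2, 128, 16, 16])
```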
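Finally, the dynamic feature enhancement module is described as multi-branch grouped dilated convolutions plus grouped deformable convolutions with different receptive fields. A sketch under assumed branch counts and dilation rates is shown below; the offset predictor, 1x1 fusion, and residual connection are assumptions for illustration only.

```python
# Hedged sketch of a "dynamic feature enhancement" block: parallel grouped
# dilated convolutions with different receptive fields plus a grouped
# deformable convolution, concatenated and fused by a 1x1 convolution.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DynamicFeatureEnhancement(nn.Module):
    def __init__(self, channels: int, groups: int = 4, dilations=(1, 2, 3)):
        super().__init__()
        # Multi-branch grouped dilated 3x3 convolutions (one branch per dilation rate).
        self.dilated = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d, groups=groups)
            for d in dilations
        )
        # Grouped deformable 3x3 convolution; its sampling offsets are predicted from x.
        self.offset = nn.Conv2d(channels, 2 * 3 * 3, 3, padding=1)
        self.deform = DeformConv2d(channels, channels, 3, padding=1, groups=groups)
        # 1x1 convolution to fuse the concatenated multi-scale branches.
        self.fuse = nn.Conv2d(channels * (len(dilations) + 1), channels, 1)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        branches = [conv(x) for conv in self.dilated]
        branches.append(self.deform(x, self.offset(x)))
        return self.act(self.fuse(torch.cat(branches, dim=1))) + x


if __name__ == "__main__":
    feat = torch.randn(2, 64, 40, 40)
    print(DynamicFeatureEnhancement(64)(feat).shape)  # torch.Size([2, 64, 40, 40])
```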
Keywords: YOLOv5 detection; self-learning attention; cross-modal attention; dynamic feature enhancement module; PET/CT lung tumor dataset
Classification codes: TP394.1 [Automation and Computer Technology - Computer Application Technology]; TH691.9 [Automation and Computer Technology - Computer Science and Technology]