Cephalometric landmark keypoints localization based on convolution-enhanced Transformer


Authors: Yang Heng; Gu Chenliang; Hu Houmin; Zhang Jing; Li Kang; He Ling (School of Electrical Engineering, Sichuan University, Chengdu 610065, China; China Southwest Electronic Technology Research Institute, Chengdu 610036, China; School of Biomedical Engineering, Sichuan University, Chengdu 610065, China)

Affiliations: [1] School of Electrical Engineering, Sichuan University, Chengdu 610065, China; [2] China Southwest Electronic Technology Research Institute, Chengdu 610036, China; [3] School of Biomedical Engineering, Sichuan University, Chengdu 610065, China

Source: Journal of Image and Graphics (中国图象图形学报), 2023, No. 11, pp. 3590-3601 (12 pages)

Funding: National Key Research and Development Program of China (2020YFB1711500); 1·3·5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (ZYYC21004).

Abstract: Objective: Accurate and reliable cephalometric measurement and analysis, which usually depend on the correlations among anatomical landmarks, play an essential role in orthodontic diagnosis, preoperative planning, and treatment evaluation. Manual annotation, however, limits both the speed and the accuracy of measurement, so an automatic cephalometric landmark detection algorithm for daily diagnosis is needed. Anatomical landmarks occupy only a small proportion of an image, and structures at different positions may share similar curvature, shape, and surrounding soft-tissue appearance that are difficult to distinguish. Current methods based on convolutional neural networks (CNNs) extract deep features by down-sampling to facilitate global connections, but they suffer from spatial information loss and inefficient context modeling, which keeps them from meeting clinical accuracy requirements. The Transformer has advantages in long-range dependency modeling, which helps confirm landmark positions, but it is poor at capturing local features, so models based on a pure Transformer localize key points with insufficient accuracy. An end-to-end model with both global context modeling and strong local spatial feature representation is therefore required. Method: This paper proposes a U-shaped architecture based on a convolution-enhanced Transformer, named CETransNet (convolutional enhanced Transformer network), to locate key points in lateral cephalometric images. The success of UNet lies in its ability to analyze fine-grained local image structure at deep levels, but it suffers from global spatial information loss. By improving the Transformer module and introducing it into the U-shaped structure, CETransNet retains the ability of convolutional networks to capture local information while establishing global context connections. In addition, to better regress the predicted heatmaps, an exponentially weighted loss function is proposed so that, during supervised learning, the loss of pixels near each landmark receives more attention while the loss of distant pixels is suppressed. Result: On two test sets, CETransNet achieves localization errors of 1.09 mm and 1.39 mm, with 2 mm detection precision of 87.19% and 76.08%, respectively. In test set 1, nine landmarks reach 100% detection precision within 4 mm, and as many as twelve landmarks exceed 90% precision within 2 mm; in test set 2, although only nine landmarks meet 90% precision within 2 mm, ten landmarks are fully detected within 4 mm. Conclusion: CETransNet detects anatomical landmarks quickly, accurately, and robustly, outperforms current state-of-the-art methods, and shows clinical application value.
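The abstract describes an exponentially weighted heatmap-regression loss that emphasizes pixels near each landmark and suppresses distant background pixels, but it does not give the exact formula. The sketch below is one plausible reading, assuming Gaussian target heatmaps and an `exp(alpha * target)` weight over a pixel-wise MSE; `gaussian_heatmap`, `alpha`, and `sigma` are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma=3.0):
    """Target heatmap: a 2-D Gaussian peaked at the landmark (assumed encoding)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def exp_weighted_mse(pred, target, alpha=2.0):
    """Exponentially weighted MSE: pixels with high target intensity (near the
    landmark) get exponentially larger weight; far-away pixels keep weight ~1,
    so their loss contribution is comparatively suppressed."""
    w = np.exp(alpha * target)            # weight grows with target intensity
    return np.mean(w * (pred - target) ** 2)
```

Under this weighting, a prediction that places its peak at the wrong location is penalized more heavily around the true landmark than in the background, which matches the behavior the abstract attributes to the proposed loss.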

Keywords: cephalometry; landmark detection; vision Transformer; attention mechanism; heatmap regression; convolutional neural network (CNN)
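The 2 mm and 4 mm "detection precision" figures reported in the abstract are success detection rates (SDR): the fraction of landmarks whose radial error falls within a given radius. A minimal sketch of argmax heatmap decoding and SDR computation follows; the argmax decoder and the pixel-spacing value used below are assumptions for illustration, not details from the paper.

```python
import numpy as np

def decode_landmark(heatmap):
    """Argmax decoding: take the brightest heatmap pixel as the predicted (x, y)."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array([float(x), float(y)])

def success_detection_rate(pred_pts, gt_pts, mm_per_px, radius_mm=2.0):
    """Fraction of landmarks whose radial error is within `radius_mm`.

    pred_pts, gt_pts: (N, 2) arrays of pixel coordinates.
    mm_per_px: physical pixel spacing used to convert pixels to millimetres.
    """
    err_mm = np.linalg.norm(pred_pts - gt_pts, axis=1) * mm_per_px
    return float(np.mean(err_mm <= radius_mm))
```

For example, with a spacing of 0.1 mm per pixel, a landmark predicted 5 px from ground truth (0.5 mm) counts toward the 2 mm SDR, while one 30 px away (3.0 mm) does not.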

Classification: TP391 (Automation and Computer Technology: Computer Application Technology)

 
