Polyp detection in CT colonography based on imbalanced data sets


Authors: Xiong Xin (熊馨), Xu Lisheng (徐礼胜)[1], Wang Chunwu (王春武)[1], Kang Yan (康雁)[1]

Affiliation: [1] Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang 110819

Source: Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》), 2013, No. 11, pp. 112-117 (6 pages)

Funding: National Natural Science Foundation of China (61071213)

Abstract: Classifiers for polyp detection in CT colonography face an imbalanced data set problem: positive samples (polyps) are far outnumbered by negative samples (non-polyps). To address this, the polyp detection classifier uses SMOTEBoost, which combines SMOTE (Synthetic Minority Over-Sampling Technique) with Boosting. At the data level, the over-sampling technique SMOTE synthesizes minority-class samples to reduce the imbalance between the two classes. At the algorithm level, Boosting improves the performance of weak classifiers. Together they improve prediction on minority-class samples while preserving classification accuracy over the entire data set. To meet the real-time requirements of polyp detection, MRMR (Minimum Redundancy Maximum Relevance) is used to select simple, low-cost features with maximum relevance and minimum redundancy. These features form the strong classifier in the first stage of the cascade, which rejects most negative samples and greatly increases processing speed. Experimental results show that the classifier achieves an overall per-polyp sensitivity of 90% for polyps with a diameter of 5 mm or greater, with an average of 6 false positives per volume.
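To make the data-level step in the abstract concrete, here is a minimal sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. It is an illustration only, not the authors' implementation. The function name `smote`, the neighbour count `k`, the random seed, and the toy two-feature polyp candidates are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating toward neighbours.

    minority:    list of feature vectors (lists of floats), minority class only.
    n_synthetic: number of synthetic samples to generate.
    k:           number of nearest minority neighbours considered (assumed value).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data: four hypothetical polyp candidates with two features each.
polyps = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]]
new_samples = smote(polyps, n_synthetic=8, k=2)
```

In a SMOTEBoost-style scheme, synthetic samples of this kind would be added to the training set in each boosting round before the weak classifier is fitted. How the paper configures that step is not stated in the abstract.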

Keywords: imbalanced data sets; CT colonography; colonic polyp detection; resampling; Boosting; cascade; AdaBoost

Classification number: R814 [Medicine and Health: Medical Imaging and Nuclear Medicine]

 
