A text region identification method based on stroke width features and semi-supervised multi-instance learning


Authors: Wu Rui (吴锐)[1], Du Qing'an (杜庆安), Zhang Boyu (张博宇)[1], Huang Qingcheng (黄庆成)[1]

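Affiliations: [1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001; [2] Tianjin Institute of Aerospace Mechanical and Electrical Equipment, Tianjin 300000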

Source: Chinese High Technology Letters (《高技术通讯》), 2016, No. 2, pp. 111-118 (8 pages)

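Funding: Supported by the National Natural Science Foundation of China (61370162; 61440025) and the Fundamental Research Funds for the Central Universities (HIT.NSRIF.2012048)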

Abstract: Given the importance of text region identification in video text detection, a text region identification method based on stroke width features is proposed. The method distinguishes text regions from non-text regions by analyzing the distribution of stroke widths in candidate text regions. In addition, a semi-supervised multi-instance learning (SS-MIL) algorithm is proposed to address the unknown polarity parameter that arises when stroke width information is extracted. The SS-MIL algorithm makes full use of the incomplete supervision information in the training samples to improve the performance of the text region classifier. A complete video text detection system was implemented based on these methods and tested thoroughly on representative data sets such as MCTS. The experimental results show that text region identification based on stroke width features and SS-MIL discriminates text regions effectively, so that the system achieves a high overall performance in video text detection.
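The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: horizontal run lengths stand in for a real stroke width extractor, the SS-MIL training procedure is not shown, and every function name, including the `consistency_score` scorer, is a hypothetical choice made for this example. Each candidate region is treated as a bag with one instance per polarity (foreground pixel value 1 or 0), and the bag is scored by its best instance, which is the standard multi-instance assumption.

```python
from statistics import mean, pstdev

def run_lengths(binary_rows, foreground):
    """Horizontal run lengths of `foreground` pixels (a crude stroke width proxy)."""
    widths = []
    for row in binary_rows:
        run = 0
        for pixel in row:
            if pixel == foreground:
                run += 1
            elif run:
                widths.append(run)
                run = 0
        if run:  # close a run that reaches the end of the row
            widths.append(run)
    return widths

def width_features(widths):
    """Return (mean width, coefficient of variation) for one polarity."""
    if not widths:
        return (0.0, float("inf"))
    m = mean(widths)
    return (m, pstdev(widths) / m)

def region_bag(binary_rows):
    """Build a bag with one instance per polarity, since the true polarity is unknown."""
    return [width_features(run_lengths(binary_rows, fg)) for fg in (1, 0)]

def consistency_score(instance):
    """Hypothetical scorer: lower stroke width variation gives a higher score."""
    _, cv = instance
    return 1.0 / (1.0 + cv)

def bag_score(bag, instance_score):
    """Multi-instance assumption: a bag is as positive as its best instance."""
    return max(instance_score(inst) for inst in bag)

if __name__ == "__main__":
    region = [[0, 1, 1, 0, 0, 1, 1, 0],
              [0, 1, 1, 0, 0, 1, 1, 0]]
    print(bag_score(region_bag(region), consistency_score))
```

In a trained system the hand-written scorer would be replaced by a learned instance classifier; the sketch only shows why an unknown polarity maps naturally onto a bag of two instances.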

Keywords: text region identification; stroke width; semi-supervised learning; multi-instance learning (MIL)

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
