Study on the Multi-Font Detection and Recognition of Tibetan in the Social Media


Authors: Yongtso; LONG Bingxin; Lamao-Jie; Renqing-Dongzhu; Nyima-Trashi

Affiliations: [1] School of Information Science and Technology, Tibet University, Lhasa 850000, China; [2] Tibetan Information Technology and Artificial Intelligence Key Laboratory of Tibet Autonomous Region, Lhasa 850000, China; [3] Engineering Research Center of Tibetan Information Technology, Ministry of Education, Lhasa 850000, China

Source: Plateau Science Research, 2023, No. 4, pp. 76-85 (10 pages)

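Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116100); Tibet Autonomous Region Science and Technology Innovation Base Independent Research Project (XZ2021JR0002G).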

Abstract: Social media gives the public a more convenient platform for communication and for spreading information, and extracting Tibetan text from social media images matters for understanding that content and for knowledge mining. Tibetan text images on social media have complex backgrounds, multiple fonts, mixed-font typesetting, and diverse layouts. To address these characteristics, the paper builds a dataset for recognizing Tibetan text in social media images and proposes an end-to-end detection and recognition algorithm that combines PSENET with a CRNN (convolutional recurrent neural network). The algorithm uses PSENET for multi-angle text detection and then applies a CRNN model based on a multi-head attention mechanism for text recognition. Experimental results show that the detection rate and the multi-font recognition rate reach 95.7% and 84.5%, respectively. Compared with a model without pre-training and with a CTC (connectionist temporal classification) recognition model, accuracy improves by 34.6% and 4.14%, respectively. These results indicate that the method has good practical value and application prospects for multi-font recognition of Tibetan text in social media.
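The two-stage structure described in the abstract (detect text regions, then recognize each region) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Box`, `TextRegion`, `crop`, `read_text`, and `ctc_greedy_decode` are hypothetical, the detector and recognizer are passed in as plain callables standing in for PSENET and the attention-based CRNN, and boxes are assumed to be axis-aligned for simplicity even though the paper's detection is multi-angle. The `ctc_greedy_decode` helper shows only the standard collapse rule used by CTC-style recognizers such as the baseline the abstract compares against.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumption: axis-aligned boxes as (left, top, right, bottom).
Box = Tuple[int, int, int, int]
# Assumption: a toy grayscale image stored as rows of pixel values.
Image = List[List[int]]


@dataclass
class TextRegion:
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangular patch described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse consecutive repeated labels, then drop the blank label."""
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def read_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[TextRegion]:
    """Run detection first, then recognition on each detected patch."""
    return [TextRegion(box, recognize(crop(image, box))) for box in detect(image)]
```

Passing the detector and recognizer as callables keeps the sketch independent of any particular model library. A real system would substitute trained models for the stand-ins and would need to rectify rotated regions before recognition.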

Keywords: social media; Tibetan; multi-font; text recognition

Classification No.: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
