Authors: YAO Wei (姚炜) [1]; FENG Xianwei (冯宪伟) [1]
Affiliation: [1] Office of Industry Education Integration, Jiangsu Vocational Institute of Commerce, Nanjing, Jiangsu 211168, China
Source: Chinese Journal of Sensors and Actuators (《传感技术学报》), 2024, No. 12, pp. 2107-2112 (6 pages)
Funding: Key Project of the 2024 Jiangsu Province Education Science Planning Program (B-b/2024/02/116).
Abstract: With the continuous development of computer vision and natural language processing technologies, natural scene text detection and recognition has become one of the research hotspots in computer vision. A natural scene text detection and recognition method based on a multi-head attention mechanism and a long short-term memory (LSTM) network is proposed. The method combines an object detection algorithm with a sequence recognition algorithm: the multi-head attention mechanism precisely locates text regions in the image and extracts their features, and the LSTM network then encodes and decodes the extracted features to recognize text in natural scenes accurately. In the text detection stage, a deep learning-based object detection algorithm is combined with the multi-head attention mechanism, which computes multiple independent attention heads in parallel to capture text information at different scales and orientations in the image, thereby improving the accuracy and robustness of text detection. In the text recognition stage, the LSTM network performs sequence modeling on the detected text regions and, through the encoding and decoding process, converts the text information in the image into readable character sequences. Experimental results show that the proposed method achieves excellent performance on natural scene text detection and recognition tasks. Compared with existing methods, the proposed method improves both accuracy and robustness, and adapts better when handling complex backgrounds and diverse text.
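The abstract's phrase "multiple independent attention heads computed in parallel" can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`softmax`, `attention_head`, `multi_head_self_attention`) are hypothetical, and the learned query/key/value and output projections that a real model would use are deliberately omitted, so each head simply attends over its own slice of the feature vector. The LSTM recognition stage is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(queries, keys, values):
    """Scaled dot-product attention for one head (lists of equal-length vectors)."""
    d_k = len(keys[0])
    output = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return output


def multi_head_self_attention(features, num_heads):
    """Split each feature vector into num_heads slices, attend per slice, concatenate.

    Assumption: identity projections (no learned weights), for illustration only.
    """
    d_model = len(features[0])
    if d_model % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        sliced = [row[lo:hi] for row in features]
        head_outputs.append(attention_head(sliced, sliced, sliced))
    # Concatenate the per-head outputs position by position.
    return [
        [x for head in head_outputs for x in head[t]]
        for t in range(len(features))
    ]


if __name__ == "__main__":
    # Three image positions, each with a 4-dimensional feature vector (made-up values).
    feats = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
    for row in multi_head_self_attention(feats, num_heads=2):
        print(row)
```

Because each head sees a different slice of the features, the heads can weight the same positions differently, which is the property the abstract relies on for capturing text at varied scales and orientations.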
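Keywords: text detection and recognition; multi-head attention mechanism; natural scene text; long short-term memory network
Classification codes: TN911.73 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Electronics and Telecommunications: Information and Communication Engineering]; TP391.43 [Automation and Computer Technology: Control Theory and Control Engineering]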