Chemical structure recognition method based on attention mechanism and encoder-decoder architecture

Cited by: 1


Authors: Zeng Shuiling [1,2]; Li Zhaoxian; Zhang Jiaxiong; Ding Longfei; Zhao Cairong

Affiliations: [1] School of Communication and Electronic Engineering, Jishou University, Jishou 416000, China [2] Jiangsu Key Laboratory of Image and Video Understanding for Social Safety, Nanjing University of Science and Technology, Nanjing 210094, China [3] College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China

Source: Journal of Image and Graphics (《中国图象图形学报》), 2024, Issue 7, pp. 1960-1969 (10 pages)

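Funding: National Natural Science Foundation of China (61966014); Natural Science Foundation of Hunan Province (2024JJ7413); Open Project of the Jiangsu Key Laboratory of Image and Video Understanding for Social Safety (202212); Jishou University research projects (JGY2023071, Jdy23042); Hunan Provincial Postgraduate Research and Innovation Project (QL20230255, CX20221107).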

Abstract: Objective Chemical structure recognition is an important problem in chemistry and computer vision. Traditional optical chemical structure recognition techniques are prone to information loss or misrecognition on complex chemical structures, and the structural diversity of chemical substances often leaves them unable to parse the input, so recognition performance is poor. Deep-learning-based models, in turn, typically suffer from high network complexity, loss of contextual information, and low recognition rates. To address this, a chemical structure recognition method combining an attention mechanism with an encoder-decoder architecture is proposed. Method First, an improved ResNet50 (residual network) is used as the feature extractor to capture representational information. Second, a BLSTM (bidirectional long short-term memory) network is used as a row encoder to strengthen the spatial information in the features extracted by ResNet50. Finally, a padding-removal module and an LSTM (long short-term memory) network based on a coverage attention mechanism serve as the model decoder, decoding the encoded chemical structure image into a SMILES (simplified molecular input line entry system) sequence. Result On the Indigo, ChemDraw, CLEF (Conference and Labs of the Evaluation Forum), JPO (Japanese Patent Office), UOB (University of Birmingham), USPTO (United States Patent and Trademark Office), Staker, ACS (American Chemical Society), CASIA-CSDB (Institute of Automation of Chinese Academy of Sciences, Chemical Structure Database), and Mini CASIA-CSDB datasets, the proposed method achieves recognition accuracies of 71.1%, 70.21%, 45.8%, 30.3%, 53.02%, 58.21%, 43.39%, 46.3%, 84.42%, and 85.78%, respectively, higher than the scores of the SwinOCSR, Image2Mol, and ChemPix models. Conclusion Compared with other models, the proposed method achieves relatively high recognition accuracy with a small training set.

Objective Emerging digital and intelligent technologies have ushered in a new era of text recognition and interpretation. These advancements have greatly facilitated the ability to recognize and comprehend textual content originating from a variety of sources, including paper documents, photographs, and diverse contexts. One particularly noteworthy application of these technologies is in the field of chemical structure image recognition, where portable devices such as mobile phones and tablet PCs have become indispensable tools, playing a vital role in converting hand-drawn chemical structure images into machine-readable formats. They translate these intricate structures into human-readable representations, simultaneously highlighting relevant physical properties, chemical characteristics, and elemental compositions. These innovative models for chemical structure recognition serve as a bridge between hand-drawn representations and machine-interpretable data. This capability has made it feasible to electronically document complex scenarios, such as those encountered in classrooms and academic meetings. Notably, ongoing research has focused on developing encoder-decoder-based methods for mathematical expression recognition, which have shown promising results. However, the pivotal role of the quality and quantity of training data in shaping the performance of deep neural networks needs to be acknowledged. The current challenge lies in the absence of a comprehensive, high-quality dataset that is specifically tailored for chemical structure image recognition. This data deficiency poses a significant hurdle, impacting the optimization, generalization, and robustness of the models. Furthermore, the computational demands of real-time offline recognition on mobile devices remain a practical limitation. Method To address the aforementioned issues, we developed a chemical structure recognition model based on an encoder-decoder architecture. This model is capable of generating corresponding character representations, such as SMILES, from given chemical stru
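The abstract names a coverage attention mechanism in the decoder but gives no formulas. The following is a minimal sketch of the general idea, not the authors' implementation: a running coverage vector sums the attention weights of past decoding steps and feeds into the score, so positions that were already attended can be down-weighted. For readability each encoder feature and the decoder state are single scalars, and the function name and the weights `w_f`, `w_s`, `w_c`, `v` are hypothetical illustration parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coverage_attention_step(features, state, coverage,
                            w_f=1.0, w_s=1.0, w_c=-1.0, v=1.0):
    """One decoding step of a toy (scalar) coverage attention.

    features: one scalar encoder feature per image position
    state:    scalar decoder hidden state (assumed, for illustration)
    coverage: running sum of past attention weights per position
    A negative w_c lowers the score of positions already covered.
    """
    scores = [
        v * math.tanh(w_f * f + w_s * state + w_c * c)
        for f, c in zip(features, coverage)
    ]
    weights = softmax(scores)
    # Context is the attention-weighted sum of the encoder features.
    context = sum(w * f for w, f in zip(weights, features))
    # Accumulate this step's weights into the coverage vector.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, context, new_coverage


# Equal features, but position 0 has already received attention.
features = [0.5, 0.5, 0.5]
coverage = [1.0, 0.0, 0.0]
weights, context, new_coverage = coverage_attention_step(
    features, state=0.0, coverage=coverage
)
```

With equal features, the previously attended position 0 receives a smaller weight than the two uncovered positions, which is the behaviour coverage is meant to encourage. A real decoder would use vector features, learned weight matrices, and an LSTM state in place of these scalars.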

Keywords: chemical structure recognition; encoder-decoder; attention mechanism; residual network; SMILES (simplified molecular input line entry system)

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
