Affiliations: [1] Beijing Research Center for Information Technology in Agriculture, Beijing 100097; [2] College of Information Science and Technology, Hebei Agricultural University, Baoding 071001; [3] National Engineering Research Center for Information Technology in Agriculture, Beijing 100097; [4] Hebei Key Laboratory of Agricultural Big Data, Baoding 071001; [5] College of Mechanical and Electrical Engineering, Hebei Agricultural University, Baoding 071001
Source: Transactions of the Chinese Society of Agricultural Engineering, 2021, No. 11, pp. 180-188 (9 pages)
Funding: Beijing Municipal Science and Technology Project (Z191100004019007); National Key Research and Development Program of China (2020YFD1100602, 2019YFD1101105); Hebei Provincial Key Research and Development Program (20327402D); Fundamental Research Funds for Hebei Provincial Universities (KY202004); Hebei Provincial Foundation for Returned Overseas Scholars (C20190340).
Abstract: Visual recognition methods based on deep convolutional neural networks (DCNN) perform well in disease diagnosis and have gradually become a research focus. However, DCNN-based visual recognition models typically use only image-modality data, so both their recognition accuracy and their robustness depend on the scale of the training dataset and the quality of its annotation. Building a large-scale disease dataset under open-environment conditions and annotating it to a high standard usually carries enormous economic and technical costs, which has limited the adoption of DCNN-based visual recognition in practical applications. This study proposes bimodalNet, a crop disease recognition model for open environments based on joint image-text bimodal representation learning. On top of the disease image modality, the model embeds disease text-modality information and exploits the correlation and complementarity between the two modalities to achieve joint representation learning of disease features. On a relatively small dataset, bimodalNet outperformed both the pure image-modality model and the pure text-modality model; the best model combination achieved 99.47% accuracy, 98.51% precision, 98.61% sensitivity, 99.68% specificity, and a 98.51% F1 score on the test set. The study demonstrates that bimodal representation learning over disease images and disease texts is an effective approach to crop disease recognition in open environments.

Recognition using Deep Convolutional Neural Networks (DCNN) performs well in plant disease diagnosis. However, the recognition accuracy and robustness of such models depend mainly on the scale of the training dataset and the quality of annotation, particularly because DCNN recognition models use only image-modality data. It is necessary to build a large-scale disease dataset and complete high-quality annotation in open environments; nevertheless, the huge economic and technical costs have limited the promotion of DCNN recognition in practical applications. Inspired by multimodal learning, a crop disease recognition model for open environments, called bimodalNet, was proposed using flexible image-text bimodal joint representation learning. Image-associated text information was brought into the disease recognition task. The input of the network was an image-text pair composed of a disease image and its description text. The disease images were cropped and resized to 224×224 pixels. The description text was in Chinese, and a series of operations completed the text embedding, including normalization, word segmentation, vocabulary construction, and text vectorization. The network structure consisted of two branches: an image branch, in which a CNN extracted disease features from the images, and a text branch, in which a recurrent neural network learned disease features from the description text. The correlation and complementarity between the two modalities of disease information were utilized, thereby realizing the joint representation learning of disease features in bimodalNet. The outputs of the image and text branches were added and fused into the network output. Different image and text classifiers were used to meet the needs of various tasks, and the best combination for disease feature extraction was selected from six networks in the experiments. The experimental dataset consisted of 1834 disease image-text pairs. Among them, the disease images were all taken in the field environment.
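The abstract outlines a four-step text-embedding pipeline for the Chinese description texts: normalization, word segmentation, vocabulary construction, and vectorization. A minimal sketch of such a pipeline follows; the paper does not name its segmenter or vocabulary scheme, so the use of jieba, the `<pad>`/`<unk>` tokens, and the fixed sequence length of 32 are illustrative assumptions.

```python
# Sketch of the text-embedding steps named in the abstract: normalization,
# word segmentation, vocabulary construction, and vectorization.
# Assumption: jieba for Chinese segmentation; the paper does not specify
# its segmenter, special tokens, or padding length.
import jieba

def build_vocab(texts):
    """Build a word-to-index vocabulary; 0 is reserved for padding."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for token in jieba.lcut(text.strip().lower()):  # normalize + segment
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab, max_len=32):
    """Map one description text to a fixed-length sequence of token ids."""
    tokens = jieba.lcut(text.strip().lower())
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad to max_len
```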
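The two-branch structure itself, a CNN image branch and a recurrent text branch whose outputs are added before classification, can likewise be sketched. The backbone choices below (resnet18, a single-layer LSTM) and the layer sizes are assumptions for illustration; the paper evaluates six branch combinations rather than fixing one.

```python
# Hedged PyTorch sketch of the bimodal fusion described in the abstract:
# a CNN branch for 224x224 disease images and an RNN branch for the
# description text, fused by adding the two branch outputs.
import torch
import torch.nn as nn
import torchvision.models as models

class BimodalNet(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=128, hidden_dim=128):
        super().__init__()
        # Image branch: CNN backbone mapped to class logits.
        # resnet18 is an assumed stand-in for the paper's image networks.
        self.cnn = models.resnet18(weights=None)
        self.cnn.fc = nn.Linear(self.cnn.fc.in_features, num_classes)
        # Text branch: embedding + LSTM over the segmented description text.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.text_fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, image, token_ids):
        img_logits = self.cnn(image)                  # (B, num_classes)
        _, (h, _) = self.rnn(self.embed(token_ids))   # h: (1, B, hidden_dim)
        txt_logits = self.text_fc(h[-1])              # (B, num_classes)
        # Fusion by element-wise addition of branch outputs, per the abstract.
        return img_logits + txt_logits
```

A quick shape check: `BimodalNet(vocab_size=5000, num_classes=10)(torch.randn(2, 3, 224, 224), torch.randint(0, 5000, (2, 32)))` returns a (2, 10) logit tensor, consistent with the 224×224 image inputs and fixed-length token sequences described above.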