Intent Recognition of Cotton Disease and Pest Questions Based on ERNIE and Improved DPCNN


Authors: Li Dongya, Bai Tao [1,2,3], Xiang Huimin [4], Dai Shuo, Wang Zhenlu

Affiliations: [1] College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, Xinjiang, China; [2] Intelligent Agriculture Engineering Research Center of the Ministry of Education, Urumqi 830052, Xinjiang, China; [3] Xinjiang Agricultural Informatization Engineering Technology Research Center, Urumqi 830052, Xinjiang, China; [4] Xinjiang Science and Technology College, Urumqi 830049, Xinjiang, China

Source: Shandong Agricultural Sciences (《山东农业科学》), 2024, No. 6, pp. 143-151 (9 pages)

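Funding: Ministry of Science and Technology, Science and Technology Innovation 2030 Major Project "Swarm-Intelligence Autonomous-Operation Smart Farm" (2022ZD0115800); Xinjiang Uygur Autonomous Region Major Science and Technology Special Project "Research on Key Technologies of an Intelligent Farm Platform" (2022A02011-4); Basic Scientific Research Funds for Universities of the Xinjiang Uygur Autonomous Region, research project "Agricultural Big Data Exchange, Sharing, and Visualization Platform" (XJEDU2022J009).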

Abstract: No public dataset of cotton disease and pest questions currently exists, and such questions are short and varied in type. To address this, the study built CQCls, a cotton disease and pest question dataset, by reviewing the literature and consulting domain experts; the dataset defines 78 cotton disease and pest entities and 9 question types. The study also proposes a question intent recognition model based on the ERNIE pre-trained model. The ERNIE model first maps the input question into vector space; a DPCNN model that fuses word position information then extracts the feature vector; and Softmax produces the final result. Compared with the basic DPCNN model, fusing word position information effectively improves the model's expressive power. In the experiments, the proposed intent recognition model achieved better results than the other models compared, with macro-average and weighted-average F1 scores of 97.45% and 97.31%, respectively. On the DMSCD dataset, whose text corpus is complex and diverse in content and irregular in format, the weighted average of per-category F1 scores in the training results still reached 73.42%, which further supports the effectiveness and generalization ability of the model.
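The abstract reports both a macro-average and a weighted-average F1 score. As a minimal sketch of how the two differ (not the authors' evaluation code), the pure-Python example below computes per-class F1 and then averages it both ways. The intent labels `symptom`, `control`, and `cause` and the toy predictions are hypothetical and do not come from CQCls.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label that occurs in y_true."""
    support = Counter(y_true)
    scores = {}
    for label in sorted(support):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(scores):
    """Mean of per-class F1 weighted by the number of true samples per class."""
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical toy data: three intent classes with unequal support.
y_true = ["symptom", "symptom", "symptom", "control", "control", "cause"]
y_pred = ["symptom", "symptom", "control", "control", "control", "symptom"]

scores = per_class_f1(y_true, y_pred)
print(f"macro F1:    {macro_f1(scores):.4f}")
print(f"weighted F1: {weighted_f1(scores):.4f}")
```

In this toy case the rare class `cause` is never predicted correctly, so it pulls the macro average down more than the weighted average. Reporting both, as the abstract does, shows whether performance is even across question types or dominated by the frequent ones.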

Keywords: cotton diseases and pests; question intent recognition; ERNIE model; DPCNN model; word position information

CLC number: S435.62 [Agricultural Sciences: Agricultural Entomology and Pest Control]; TP391.1 [Agricultural Sciences: Plant Protection]

 
