Text-to-SQL model based on semantic enhanced schema linking  (Cited by: 1)


Authors: WU Xianglan (吴相岚), XIAO Yang (肖洋), LIU Mengying (刘梦莹), LIU Mingming (刘明铭) [1]

Affiliation: [1] College of Software, Nankai University, Tianjin 300457, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 9, pp. 2689-2695 (7 pages)

Abstract: To improve Text-to-SQL generation with heterogeneous graph encoders, the SELSQL model is proposed. First, the model adopts an end-to-end learning framework and replaces the Euclidean distance metric with the Poincaré distance metric in hyperbolic space, in order to optimize the semantically enhanced schema linking graph that is constructed from a pre-trained language model using probing. Second, K-head weighted cosine similarity and graph regularization are used to learn a similarity metric graph, so that the initial schema linking graph is iteratively optimized during training. Finally, an improved Relational Graph Attention Network (RGAT) graph encoder and a multi-head attention mechanism encode the joint semantic schema linking graph of the two modules, and a grammar-based neural semantic decoder together with a predefined structured language decodes the Structured Query Language (SQL) statement. Experimental results on the Spider dataset show that, with the ELECTRA-large pre-trained model, SELSQL improves accuracy by 2.5 percentage points over the best baseline model, with a large improvement on the generation of complex SQL statements.
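Illustrative note: the abstract names two similarity measures without giving their formulas. The sketch below shows the textbook Poincaré-ball distance and one common form of K-head weighted cosine similarity (averaging the cosine over K element-wise reweighted copies of the two vectors). Both are assumptions about the general technique, not the paper's exact formulation; the function names and the `head_weights` parameter are hypothetical, and in a real model the weights would be learned.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def poincare_distance(u: Vector, v: Vector) -> float:
    """Standard distance between two points strictly inside the unit ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def cosine(a: Vector, b: Vector) -> float:
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_head_weighted_cosine(
    a: Vector, b: Vector, head_weights: Sequence[Vector]
) -> float:
    """Average cosine similarity over K heads (assumed form, not the paper's).

    Each head rescales both vectors element-wise with its own weight
    vector before the cosine is taken.
    """
    sims = [
        cosine([w * x for w, x in zip(ws, a)], [w * y for w, y in zip(ws, b)])
        for ws in head_weights
    ]
    return sum(sims) / len(sims)
```

Under these definitions, distances grow quickly as points approach the boundary of the unit ball, which is the property usually cited as the reason to prefer hyperbolic over Euclidean distance for hierarchical structure.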

Keywords: schema linking; graph structure learning; pre-trained language model; Text-to-SQL; heterogeneous graph

Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
