SQL generation method based on dependency relational graph attention network

Authors: SHU Qing; LIU Xiping[1]; TAN Zhao; LI Xi; WAN Changxuan[1]; LIU Dexi[1]; LIAO Guoqiong[1] (School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China; School of Software, Jiangxi Agricultural University, Nanchang 330013, China)

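Affiliations: [1] School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, Jiangxi, China; [2] School of Software, Jiangxi Agricultural University, Nanchang 330013, Jiangxi, China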

Source: Journal of Zhejiang University (Engineering Science), 2024, No. 5, pp. 908-917 (10 pages)

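Funding: National Natural Science Foundation of China (62076112, 62272205, 62272206, 62272207); Natural Science Foundation of Jiangxi Province (20232ACB202008); Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ190255); Jiangxi Provincial Graduate Innovation Special Fund Project (YC2023-B185).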

Abstract: The problem of generating structured query language (SQL) from natural language questions (Text-to-SQL) was studied. A two-stage framework was proposed that decouples schema linking from SQL generation in order to reduce the difficulty of SQL generation. In the first stage, a schema linker based on a relational graph attention network identifies the database tables, columns, and values mentioned in the question. The syntactic structure of the question and the internal relationships among database schema items guide the model in learning the alignment between the question and the database. When the question graph is constructed, the original syntactic dependency tree is adapted to the characteristics of the Text-to-SQL task: relations irrelevant to schema linking are merged, and dependencies are added between the dependents in coordinate structures and other constituents of the sentence, which helps the model capture long-distance dependencies. In the second stage, SQL is generated by injecting the alignment information into the T5 encoder and fine-tuning T5. Experiments were conducted on the Spider, Spider-DK, and Spider-Syn datasets. The results show that the method performs well, especially on Text-to-SQL questions of medium difficulty and above.
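
The question-graph construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which relations are merged or exactly which edges are added, so the `KEPT_RELATIONS` set, the `"other"` label, the function name `build_question_graph`, and the two propagation rules below are all assumptions chosen for illustration. Relation labels follow Universal Dependencies naming.

```python
from typing import List, Tuple

# (head token index, dependent token index, relation label)
Edge = Tuple[int, int, str]

# Assumption: relations treated as relevant to schema linking.
# The abstract does not list them; this set is purely illustrative.
KEPT_RELATIONS = {"nsubj", "obj", "nmod", "amod", "compound", "conj"}


def build_question_graph(edges: List[Edge]) -> List[Edge]:
    """Merge irrelevant relations, then add edges for coordinate structures."""
    # Step 1: collapse every relation outside KEPT_RELATIONS into "other".
    merged = [(h, d, r if r in KEPT_RELATIONS else "other") for h, d, r in edges]

    # Map each token to its incoming (governor, relation) pair.
    incoming = {d: (h, r) for h, d, r in merged}

    extra: List[Edge] = []

    def add(edge: Edge) -> None:
        if edge not in merged and edge not in extra:
            extra.append(edge)

    # Step 2: for each conj edge (first conjunct -> later conjunct) ...
    for first, later, rel in merged:
        if rel != "conj":
            continue
        # (a) attach the later conjunct to the first conjunct's governor,
        if first in incoming:
            gov, gov_rel = incoming[first]
            add((gov, later, gov_rel))
        # (b) and share the first conjunct's kept dependents with it.
        for h, d, r in merged:
            if h == first and d != later and r not in ("conj", "other"):
                add((later, d, r))

    return merged + extra


# Toy parse of "Show the name and age of singers"
# tokens: 0 Show, 1 the, 2 name, 3 and, 4 age, 5 of, 6 singers
edges = [
    (0, 2, "obj"), (2, 1, "det"), (2, 4, "conj"),
    (4, 3, "cc"), (2, 6, "nmod"), (6, 5, "case"),
]
graph = build_question_graph(edges)
```

In this toy parse, the conjunct "age" gains an `obj` edge from "Show" and an `nmod` edge to "singers", so it is one hop from both words instead of being reachable only through "name".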

Keywords: Text-to-SQL; natural language query; dependency parsing; relational graph attention network

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]

 
