Research on Converting Complex Natural Language Queries to SQL Based on a Tree Model (基于树状模型的复杂自然语言查询转SQL技术研究)  Cited by: 3

Converting Complex Natural Language Query to SQL Based on Tree Representation Model


Authors: ZHAO Meng (赵猛); CHEN Ke (陈珂) [1,2]; SHOU Li-Dan (寿黎但) [1,2,3]; WU Sai (伍赛) [1,2]; CHEN Gang (陈刚) [1,2]

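Affiliations: [1] School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China [2] Key Laboratory of Big Data Intelligent Computing of Zhejiang Province (Zhejiang University), Hangzhou 310027, China [3] State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou 310027, China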

Source: Journal of Software (《软件学报》), 2022, No. 12, pp. 4727-4745 (19 pages)

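Funding: Key Research and Development Program of Zhejiang Province (2021C01009); National Natural Science Foundation of China (62050099); Fundamental Research Funds for Universities.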

Abstract: Natural language query to SQL (NL2SQL) is the technology that automatically converts a query expressed in natural language into a structured SQL expression that a database system can parse and execute. NL2SQL gives ordinary users a natural interactive interface for database queries, enabling question answering on top of database systems. NL2SQL for complex queries is currently a research hotspot in the database community, and the prevailing approach models the problem with a sequence-to-sequence (Seq2seq) encoder-decoder. However, most existing work targets English. When applied to real Chinese-language scenarios, the colloquial expressions particular to Chinese make complex queries hard to convert. In addition, existing work struggles to correctly output query clauses that contain complex calculation expressions. To address these problems, this study proposes a tree model in place of the sequence representation. A complex query is decomposed top-down into a multi-way tree whose nodes represent the constituent elements of SQL, and the SQL statement is predicted and generated by depth-first search. On the two official test sets of the DuSQL Chinese NL2SQL Competition, the approach ranked first and second, respectively, which confirms its effectiveness.
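The abstract's idea of representing a SQL query as a multi-way tree and producing the statement by depth-first traversal can be illustrated with a small sketch. This is not the authors' model: the neural prediction of each node is omitted, and the `Node` class, its `kind` values, the `render` function, and the example table and column names are all hypothetical, chosen only to show how a calculation expression such as `price * quantity` sits naturally as a subtree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a hypothetical multi-way SQL tree (illustrative only)."""
    kind: str                      # "query", "clause", "list", "binop", or "leaf"
    label: str = ""                # keyword, operator, column, table, or literal
    children: List["Node"] = field(default_factory=list)


def render(node: Node) -> str:
    """Depth-first traversal that turns a subtree into SQL text."""
    # Recurse into the children first, left to right.
    parts = [render(child) for child in node.children]
    if node.kind == "leaf":
        return node.label
    if node.kind == "binop":       # infix: left operand, operator, right operand
        left, right = parts
        return left + " " + node.label + " " + right
    if node.kind == "list":        # comma-separated items, e.g. a SELECT list
        return ", ".join(parts)
    if node.kind == "clause":      # keyword followed by its content
        return node.label + " " + " ".join(parts)
    if node.kind == "query":       # root: clauses in order
        return " ".join(parts)
    raise ValueError("unknown node kind: " + node.kind)


def leaf(text: str) -> Node:
    return Node("leaf", text)


# Hand-built tree; in an NL2SQL system each node would be predicted by a model.
example = Node("query", children=[
    Node("clause", "SELECT", [Node("list", children=[
        leaf("name"),
        Node("binop", "*", [leaf("price"), leaf("quantity")]),
    ])]),
    Node("clause", "FROM", [leaf("orders")]),
    Node("clause", "WHERE", [Node("binop", ">", [leaf("price"), leaf("100")])]),
])

sql = render(example)
print(sql)  # SELECT name, price * quantity FROM orders WHERE price > 100
```

In this sketch the tree is built by hand and only serialized. A tree-based decoder would instead decide, at each step of the depth-first expansion, which child nodes to create, so nested expressions are generated as whole subtrees rather than as a flat token sequence.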

Keywords: natural language query to SQL (NL2SQL); semantic parsing; natural language processing

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
