DTS-SQL

fw#model #schema-linking

Introduction

A Text2SQL model predicts an SQL query given a natural language question and a database schema.

We assume that the sub-objectives of the model include:

  1. Schema Linking: Extracting the tables relevant to the query.
  2. SQL Generation: Generating the SQL query given the question and the relevant tables.

DTS-SQL mainly proposes separating these sub-objectives into dedicated sub-models, each of which is then fine-tuned on its own task.
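The decomposition can be illustrated with a minimal sketch of a two-stage pipeline. The function names, the `Schema` type, and the toy stand-ins for the two fine-tuned sub-models are all hypothetical and only serve to show how the stages chain together; they are not taken from the DTS-SQL implementation.

```python
from typing import Callable, Dict, List

# Assumed representation: table name -> list of column names.
Schema = Dict[str, List[str]]


def two_stage_text2sql(
    question: str,
    schema: Schema,
    link_schema: Callable[[str, Schema], List[str]],
    generate_sql: Callable[[str, Schema], str],
) -> str:
    """Run schema linking first, then SQL generation on the reduced schema."""
    selected = link_schema(question, schema)
    # Keep only tables that actually exist in the schema.
    reduced = {t: schema[t] for t in selected if t in schema}
    return generate_sql(question, reduced)


# Toy stand-ins for the two sub-models (hypothetical, no real model involved).
def toy_linker(question: str, schema: Schema) -> List[str]:
    return [t for t in schema if t.lower() in question.lower()]


def toy_generator(question: str, schema: Schema) -> str:
    return f"SELECT * FROM {', '.join(schema)}"


schema = {"singer": ["id", "name"], "concert": ["id", "year"]}
sql = two_stage_text2sql(
    "How many singer rows are there?", schema, toy_linker, toy_generator
)
print(sql)  # SELECT * FROM singer
```

In a real setup each callable would wrap a separately fine-tuned model; the second stage only ever sees the tables returned by the first.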

DTS-SQL

Schema Linking

Introduction

Schema linking involves identifying the pertinent tables (and columns) in a database schema in response to a natural language query. It has been shown to improve cross-domain generalizability[1] and to facilitate the generation of complex queries.

Previously, schema linking was achieved mainly through in-context learning methods or implicitly during fine-tuning for SQL generation[2].
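As an illustration of schema linking treated as an explicit task, the sketch below formats a linking prompt and parses a model completion back into table names. The prompt wording, the comma-separated output format, and both function names are assumptions made for this example, not the prompt used by DTS-SQL.

```python
from typing import Dict, List

# Assumed representation: table name -> list of column names.
Schema = Dict[str, List[str]]


def build_linking_prompt(question: str, schema: Schema) -> str:
    """Describe every table, then ask which ones the question needs."""
    lines = ["List the tables needed to answer the question.", ""]
    for table, columns in schema.items():
        lines.append(f"Table {table}: {', '.join(columns)}")
    lines += ["", f"Question: {question}", "Relevant tables:"]
    return "\n".join(lines)


def parse_linked_tables(completion: str, schema: Schema) -> List[str]:
    """Parse a comma-separated completion, dropping unknown or repeated names."""
    known = {t.lower(): t for t in schema}
    seen, tables = set(), []
    for raw in completion.split(","):
        name = raw.strip().lower()
        if name in known and name not in seen:
            seen.add(name)
            tables.append(known[name])
    return tables


schema = {"singer": ["id", "name"], "concert": ["id", "year"]}
prompt = build_linking_prompt("Which singers performed in 2014?", schema)
linked = parse_linked_tables("singer, Concert, venue", schema)
print(linked)  # ['singer', 'concert']
```

Filtering the completion against the known table names guards against a model returning tables that do not exist in the schema.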

Modeling

The objective is to find the optimal model and prompt

SQL Generation

Introduction

After identifying the appropriate tables for SQL generation, the next step is to utilize a model that constructs the SQL query based on the question and the schema of the correct tables.

This can be thought of as the original Text2SQL problem, with the additional assumption that schema linking is optimal.
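Under the optimal-linking assumption, a fine-tuning example for the generation stage pairs the question and the schema of the correct tables with the gold query. The sketch below builds one such example; the prompt layout, the prompt/completion dictionary format, and the function names are hypothetical choices for illustration.

```python
from typing import Dict, List

# Assumed representation: table name -> list of column names.
Schema = Dict[str, List[str]]


def render_tables(schema: Schema, tables: List[str]) -> str:
    """Render only the selected tables, one per line, as name(col, ...)."""
    return "\n".join(f"{t}({', '.join(schema[t])})" for t in tables)


def make_generation_example(
    question: str, schema: Schema, gold_tables: List[str], gold_sql: str
) -> Dict[str, str]:
    """Build a prompt/completion pair that exposes only the gold tables."""
    prompt = (
        f"Tables:\n{render_tables(schema, gold_tables)}\n"
        f"Question: {question}\nSQL:"
    )
    return {"prompt": prompt, "completion": " " + gold_sql}


schema = {"singer": ["id", "name", "age"], "concert": ["id", "year"]}
example = make_generation_example(
    "What is the average age of singers?",
    schema,
    ["singer"],
    "SELECT AVG(age) FROM singer",
)
print(example["prompt"])
```

At inference time the gold tables would be replaced by the output of the schema-linking model, so any linking error propagates to the generated query.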

Modeling

The objective is to find the optimal model and prompt

Finally, the target model is the one defined by

Results

Model        Schema-Linking          EX (%)  EM (%)
-----------  ----------------------  ------  ------
Mistral 7B   All Tables              71.9    70.9
Mistral 7B   DTS-SQL                 78.6    73.3
Mistral 7B   Perfect Schema-Linking  86.6    80.7
DeepSeek 7B  All Tables              82.1    69.0
DeepSeek 7B  DTS-SQL                 85.5    79.1
DeepSeek 7B  Perfect Schema-Linking  90.3    84.2
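To read the table more easily, the short script below computes, for each model, the gain of DTS-SQL over the All Tables baseline and the remaining gap to Perfect Schema-Linking. The numbers are copied from the table; the interpretation of EX as execution accuracy and EM as exact-match accuracy follows the usual Text2SQL convention and is an assumption here, as are the variable names.

```python
# (EX, EM) scores copied from the results table.
results = {
    "Mistral 7B": {
        "All Tables": (71.9, 70.9),
        "DTS-SQL": (78.6, 73.3),
        "Perfect Schema-Linking": (86.6, 80.7),
    },
    "DeepSeek 7B": {
        "All Tables": (82.1, 69.0),
        "DTS-SQL": (85.5, 79.1),
        "Perfect Schema-Linking": (90.3, 84.2),
    },
}

summary = {}
for model, rows in results.items():
    base, dts, perfect = (
        rows["All Tables"],
        rows["DTS-SQL"],
        rows["Perfect Schema-Linking"],
    )
    summary[model] = {
        # Points gained by adding the separate schema-linking stage.
        "ex_gain": round(dts[0] - base[0], 1),
        "em_gain": round(dts[1] - base[1], 1),
        # Points still lost to imperfect schema linking.
        "ex_gap": round(perfect[0] - dts[0], 1),
        "em_gap": round(perfect[1] - dts[1], 1),
    }

for model, s in summary.items():
    print(model, s)
# Mistral 7B {'ex_gain': 6.7, 'em_gain': 2.4, 'ex_gap': 8.0, 'em_gap': 7.4}
# DeepSeek 7B {'ex_gain': 3.4, 'em_gain': 10.1, 'ex_gap': 4.8, 'em_gap': 5.1}
```

For both models the separate linking stage improves on the All Tables baseline, while the gap to perfect linking shows how much accuracy is still lost at the schema-linking step.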

  1. Wenqiang Lei, Weixin Wang, Zhixin Ma, Tian Gan, Wei Lu, Min-Yen Kan, and Tat-Seng Chua. 2020. Reexamining the role of schema linking in text-to-SQL. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6943–6954.
  2. Mohammadreza Pourreza and Davood Rafiei. 2023. DIN-SQL: Decomposed in-context learning of text-to-SQL with self-correction. arXiv preprint arXiv:2304.11015.