RESDSQL

Introduction

Ranking-enhanced Encoding plus a Skeleton-aware
Decoding framework for Text-to-SQL(RESDSQL)[1] is a simple, but powerful Text2SQL model.

As its name suggests, it is composed of two main components:

  1. Ranking Enhanced Encoder:
  2. Skeleton-Aware Decoder:
Incoming Technicalities

A descent understanding on Transformers is recommended for the subsequent sections.

Ranking Enhanced Encoder

Introduction

Instead of injecting all schema items, only the most relevant schema items in the input of the encoder will be considered.

For this purpose, cross-encoder is used to classify the tables and columns simultaneously and then rank them based on their probabilities. Based on the ranking order, on one hand,
we filter out the irrelevant schema items

Input

The input is represented as follows:

Where

  • are the tables
  • the columns of
  • is a delimiter

Encoding Module

will be fed into RoBERTa[2] to embed all the tokens. Since a schema item can be composed of multiple tokens, an additional pooling module is applied, consisting of:

  1. A 2-layer BiLSTM[3].
  2. A non-linear fully connected layer.

This module will be denoted by

Column-Enhanced Layer

Some questions only mention the column name rather than the table
name. This may compromise the ranking of the tables, since in that case, the table is still required in the SQL query.

Example

Question: What are flight numbers of flights departing from City "Aberdeen"?
Schema:

  • airlines:uid, airline, abbreviation, country
  • airports: city, airportcode, airportname, country
  • flights: airline, flightno, sourceairport, destairport

Note: Table airports is required since its column city is mentionned

To mitigate this, we regroup the column outputs of by table into A multi-headed attention layer will be applied as follows:

Skeleton-Aware Decoder

SQL Normalisation

To reduce the learning difficulty, the original SQL queries are normalized before training by:

  1. Unifying the keywords and schema items into lowercase
  2. Adding spaces around parentheses and replacing double quotes with single quotes
  3. Adding an ASC keyword after the ORDER BY clause if it does not specify the order.
  4. Removing the AS clause and replacing all table aliases with their original names

SQL Skeleton Extraction

Based on the normalized SQL queries, their skeletons will be extracted which only contain
SQL keywords and slots. Specifically, given a normalized SQL query, only the the keywords will be left, and the other tokens will be replace by a symbol.

Example

Normalized SQL:

select files.duration, files.file size, files.formats
   from files join song on files.f id = song.f id where
   song.genre is = ‘pop’ order by song.song name asc

SQL Skeleton:

select _ from _ where _ order by _ asc

Execution-Guided SQL Selector

Since the decoder is not constrained with SQL grammar, the model is susceptible to generate illegal SQL queries.

To alleviate this problem, an execution-guided SQL selector[4] was introduced. It performs the beam search[5] during the decoding procedure and then selects the first executable SQL query in the beam as the final result.

Notes & References


  1. Haoyang Li, Jing Zhang, Cuiping Li, & Hong Chen. (2023). RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL.↩︎
  2. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; and Stoyanov, V. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.↩︎
  3. Hochreiter, S.; and Schmidhuber, J. 1997. Long Short-Term Memory. Neural Comput., 1735–1780.↩︎
  4. Suhr et al., ACL 2020. Exploring Unexplored Generalization Challenges for Cross-Database Semantic Parsing↩︎
  5. Similar to PICARD↩︎