WikiSQL is a large crowd-sourced dataset for developing natural language interfaces for relational databases.
WikiSQL is a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia.
To summarise, it is guaranteed that the ground-truth SQL query is of the following form:
OPT-AGG (SELECT COL FROM TABLENAME
WHERE CONDITIONS
)
with:
OPT-AGG is one of MAX, MIN, COUNT, SUM or nothing.
COL is a column name.
TABLENAME is the table name.
CONDITIONS is a list of conditions in the following BNF form:
CONDITIONS ::= CONDITION | CONDITIONS OP CONDITION
OP ::= OR | AND
CONDITION ::= TOKEN CMP TOKEN
CMP ::= > | < | <> | >= | <= | ==
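As an illustration, the CONDITIONS grammar above can be checked mechanically. The left-recursive rule is equivalent to one CONDITION followed by zero or more "OP CONDITION" pairs, so a flat list of items can be validated by position. The following is a minimal sketch, not part of the dataset tooling: the function names are hypothetical, and since the grammar does not define TOKEN, it is assumed here to be any non-empty item that is neither a CMP nor an OP.

```python
from typing import List

# Terminal symbols exactly as listed in the grammar above.
CMP = {">", "<", "<>", ">=", "<=", "=="}
OP = {"OR", "AND"}


def is_token(item: str) -> bool:
    # Assumption: TOKEN is any non-empty item that is not a CMP or an OP.
    return bool(item) and item not in CMP and item not in OP


def is_valid_conditions(items: List[str]) -> bool:
    """Check that items match CONDITION (OP CONDITION)*."""
    # One CONDITION is 3 items; each extra "OP CONDITION" adds 4 more.
    if len(items) < 3 or (len(items) - 3) % 4 != 0:
        return False
    for i, item in enumerate(items):
        pos = i % 4
        if pos in (0, 2):      # left or right operand of a comparison
            ok = is_token(item)
        elif pos == 1:         # comparison operator
            ok = item in CMP
        else:                  # pos == 3: connective between conditions
            ok = item in OP
        if not ok:
            return False
    return True


print(is_valid_conditions(["age", ">", "30", "AND", "team", "==", "Bulls"]))  # True
print(is_valid_conditions(["age", "=", "30"]))  # False: "=" is not a listed CMP
```

The sketch works on pre-split items rather than raw SQL text, so quoting and multi-word values are left to whatever tokenizer produces the list.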
Three main evaluation metrics were used in Spider: