Dataset | Description | Paper | Implementation |
---|---|---|---|
WikiSQL | A large crowd-sourced dataset for developing natural language interfaces for relational databases. It was released along with Seq2SQL. | ✅ | ✅ |
Spider | A large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. | ✅ | ✅ |
BIRD | A pioneering cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. | ✅ | ✅ |
CoSQL | A corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. | ✅ | ✅ |

See Benchmarks for more details.