| Dataset | Description | Paper | Implementation |
|---|---|---|---|
| WikiSQL | A large crowd-sourced dataset for developing natural language interfaces for relational databases. It was released along with Seq2SQL. | ✅ | ✅ |
| Spider | A large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. | ✅ | ✅ |
| BIRD | A pioneering cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. | ✅ | ✅ |
| CoSQL | A corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. | ✅ | ✅ |

See Benchmarks for more details.