BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) is a pioneering cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. BIRD contains over 12,751 unique question-SQL pairs and 95 big databases with a total size of 33.4 GB. It also covers more than 37 professional domains, such as blockchain, hockey, healthcare and education.
Compared with previous Text2SQL datasets, BIRD is closer to real-world databases because its tables can contain large numbers of rows.
Unlike Spider, BIRD violates the Question Clarity assumption: some questions need external knowledge to be answered correctly. Such as:
OVER, JULIANDATE, CAST, ROUND, SUBSTR in a SQL query.
Four main evaluation metrics were used in Spider: