Schema linking plays a crucial role in Text2SQL: it aims to identify the database schema elements (such as tables and columns) and the database values that a natural language question refers to.
Schema linking is used mainly to remove tables and columns that are irrelevant to the question from the database schema, so that the model receives only the most relevant tables and columns as a reduced alternative schema.
There are primarily two strategies for schema linking:
The string matching-based approach, simple yet effective, identifies schemas and values related to a question through direct string matching.
This approach has limitations in certain scenarios, such as dealing with synonyms.
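As a rough illustration of the string matching-based approach and its synonym limitation, the following Python sketch links a question to a schema by token overlap. The toy schema, the function names, and the rule that every token of a schema name must occur in the question are assumptions made for this example, not a method prescribed by the text.

```python
import re

# Hypothetical toy schema for illustration: table name -> column names.
SCHEMA = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "concert_name", "year"],
}


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def link_by_string_match(question, schema):
    """Return the tables and columns whose name tokens all occur in the question.

    Assumed matching rule: a schema element is linked only if every token of
    its name (split on underscores and other separators) appears in the question.
    """
    question_tokens = set(tokenize(question))
    tables, columns = [], []
    for table, table_columns in schema.items():
        if set(tokenize(table)) <= question_tokens:
            tables.append(table)
        for column in table_columns:
            if set(tokenize(column)) <= question_tokens:
                columns.append(f"{table}.{column}")
    return tables, columns


# Direct lexical overlap: the table and two columns are found.
direct = link_by_string_match("What is the name and country of every singer?", SCHEMA)

# Synonyms ("nationality", "vocalist") share no tokens with the schema,
# so nothing is linked even though the intent is the same.
synonym = link_by_string_match("List the nationality of each vocalist.", SCHEMA)
```

In this sketch, the first question links `singer`, `singer.name`, and `singer.country`, while the second links nothing, which is the kind of failure on synonyms described above.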
The second strategy targets such cases: its methods assess the relevance of schemas and values to the question at a semantic level rather than by surface string overlap. Once the schema linking results are obtained, for example as matching degrees for all tables and columns, many techniques pass these results to the text-to-SQL model as additional input.
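To make the last step concrete, here is a minimal Python sketch of one way matching degrees might be turned into additional model input: columns scoring below a threshold are dropped and the remaining schema is serialized next to the question. The score values, the 0.5 threshold, the function name, and the output format are all hypothetical choices for this example; actual systems differ in how they encode the linking results.

```python
def build_linked_input(question, scores, threshold=0.5):
    """Serialize the question with only the schema items scoring at least `threshold`.

    `scores` maps a qualified name "table.column" to an assumed matching
    degree in [0, 1], as produced by some schema linking step.
    """
    kept = {}
    for qualified_name, score in scores.items():
        if score >= threshold:
            table, column = qualified_name.split(".", 1)
            kept.setdefault(table, []).append(column)
    # One "table(col1, col2)" entry per table that kept at least one column.
    schema_text = "; ".join(
        f"{table}({', '.join(columns)})" for table, columns in kept.items()
    )
    return f"question: {question} | schema: {schema_text}"


# Hypothetical matching degrees for illustration only.
example_scores = {
    "singer.name": 0.92,
    "singer.country": 0.81,
    "singer.age": 0.10,
    "concert.year": 0.05,
}

model_input = build_linked_input(
    "What is the name and country of every singer?", example_scores
)
```

Here the low-scoring `singer.age` and `concert.year` are removed, so the `concert` table disappears from the serialized schema entirely. Hard thresholding is only one option; the scores could equally be kept as soft features rather than used to prune.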