Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics.
In Text2SQL, new prompting methods are constantly being devised to achieve state-of-the-art results. The currently well-established methods are outlined below.
Language models accept a string as input, so a Text2SQL task has to be expressed as a prompt.
To help the model understand the input and the desired task, a good representation strategy is needed.
Task: How many continents are there?
Prompt:
Table continents , columns = [ ContId , Continent ]
Table countries , columns = [ CountryId , CountryName , Continent ]
Q : How many continents are there ?
A : SELECT
If such a representation is used with no additional examples, the prompt is called a zero-shot prompt.
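As a rough illustration, the sketch below assembles a zero-shot prompt in the representation shown above; the `build_zero_shot_prompt` helper and the schema dictionary are hypothetical names introduced here, not part of any particular framework.

```python
# A hypothetical helper that builds a zero-shot Text2SQL prompt in the
# "Table ... , columns = [ ... ]" representation shown above.
def build_zero_shot_prompt(schema: dict[str, list[str]], question: str) -> str:
    lines = []
    for table, columns in schema.items():
        lines.append(f"Table {table} , columns = [ {' , '.join(columns)} ]")
    lines.append(f"Q : {question}")
    lines.append("A : SELECT")  # the model is expected to complete the query
    return "\n".join(lines)

# Illustrative schema matching the example above.
schema = {
    "continents": ["ContId", "Continent"],
    "countries": ["CountryId", "CountryName", "Continent"],
}
print(build_zero_shot_prompt(schema, "How many continents are there ?"))
```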
While Text2SQL models can achieve considerable results with zero-shot prompting, including relevant examples generally improves the model's performance drastically.
For that reason, many few-shot strategies have been developed to make the most of the given examples.
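A minimal few-shot sketch under the same assumed format: (question, SQL) demonstration pairs are prepended to the target question in the Q/A style used above. The helper name, schema text, and example pair are illustrative only.

```python
# Demonstration pairs are inserted between the schema and the target question.
def build_few_shot_prompt(schema_text: str,
                          examples: list[tuple[str, str]],
                          question: str) -> str:
    parts = [schema_text]
    for ex_question, ex_sql in examples:
        parts.append(f"Q : {ex_question}")
        parts.append(f"A : {ex_sql}")
    parts.append(f"Q : {question}")
    parts.append("A : SELECT")
    return "\n".join(parts)

schema_text = ("Table continents , columns = [ ContId , Continent ]\n"
               "Table countries , columns = [ CountryId , CountryName , Continent ]")
examples = [("How many countries are there ?", "SELECT count(*) FROM countries")]
print(build_few_shot_prompt(schema_text, examples, "How many continents are there ?"))
```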
In Text2SQL, it is not practical to manually write examples for each prompt, so many example selection methods have been designed.
Example selection is itself challenging: it is not straightforward to find questions similar to the proposed one, and requiring examples from the same domain makes the task even harder.
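One common family of selection methods picks the candidate questions most similar to the target question. The sketch below uses a simple word-overlap (Jaccard) score as a stand-in for the embedding- or structure-based similarity measures used in practice; the candidate pool is made up for illustration.

```python
# Similarity-based example selection with a simple word-overlap score.
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_examples(question: str,
                    pool: list[tuple[str, str]],
                    k: int = 2) -> list[tuple[str, str]]:
    # pool holds (question, SQL) pairs; keep the k most similar questions
    return sorted(pool, key=lambda ex: jaccard(question, ex[0]), reverse=True)[:k]

pool = [
    ("How many countries are there ?", "SELECT count(*) FROM countries"),
    ("List all continents", "SELECT Continent FROM continents"),
    ("How many singers are there ?", "SELECT count(*) FROM singer"),
]
print(select_examples("How many continents are there ?", pool, k=2))
```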
Once the examples are selected, the way they are represented in the prompt can influence the model's performance in either direction, so a good representation of examples is needed as well.
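To make this concrete, the sketch below shows two possible (assumed) ways of rendering selected examples inside the prompt: as full question/SQL demonstrations, or as the example SQL queries alone.

```python
# Two assumed ways of rendering selected examples inside the prompt:
# full question/SQL demonstrations, or the example SQL queries alone.
def render_full(examples: list[tuple[str, str]]) -> str:
    return "\n".join(f"Q : {q}\nA : {sql}" for q, sql in examples)

def render_sql_only(examples: list[tuple[str, str]]) -> str:
    return "\n".join(sql for _, sql in examples)
```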