Solution | Language Model | Training Strategy | Size | Approach |
---|---|---|---|---|
DAIL-SQL | Any | Both* | * | Prompt Pipeline
CodeS | StarCoder | Pretraining + SFT | 1B/3B/7B/15B | Complex Pipeline
DTS-SQL | DeepSeek / Mistral | SFT | 3B/7B | Submodeling
RESDSQL | RoBERTa + T5 | SFT | 220M/770M/3B | Customized Encoder/Decoder
Graphix | Variant of T5 | SFT | 770M/3B | Graph Neural Network
RASAT | Variant of T5 | SFT | 60M/220M/770M/3B | Graph Neural Network
On top of these base approaches, several strategies are commonly used to improve performance:
Strategy | Idea |
---|---|
Schema-Linking | Pruning unnecessary tables and columns from the schema before generation
Self-Consistency | Sampling several generations and keeping the answer whose execution result occurs most often (sketched below)
Chain-of-Thought | Eliciting complex reasoning through intermediate reasoning steps
PICARD | Constraining beam search during decoding to reject syntactically and semantically invalid SQL
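As a concrete illustration of Self-Consistency, here is a minimal Python sketch. The `generate_sql` and `execute` callables are hypothetical stand-ins for an LLM sampling call and a database executor; only the voting logic is the strategy itself.

```python
from collections import Counter

def self_consistent_sql(question, generate_sql, execute, n_samples=10):
    """Sample several SQL candidates and keep one whose execution result
    occurs most often. `generate_sql` and `execute` are assumed helpers,
    not a fixed API."""
    votes = Counter()
    sql_for = {}  # execution result -> first SQL query that produced it
    for _ in range(n_samples):
        sql = generate_sql(question, temperature=0.8)  # sample diversely
        try:
            result = tuple(map(tuple, execute(sql)))   # rows -> hashable
        except Exception:
            continue  # queries that fail to execute get no vote
        votes[result] += 1
        sql_for.setdefault(result, sql)
    if not votes:
        return None  # no candidate executed successfully
    best_result, _ = votes.most_common(1)[0]
    return sql_for[best_result]
```

Voting over execution results rather than raw query strings means two syntactically different but equivalent queries reinforce each other.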
Some solutions rely purely on prompting. These methods are generally agnostic to the LLM itself, and so offer a loose coupling between the prompting pipeline and the model: the prompt builder never needs to know which LLM answers it.
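A minimal sketch of that loose coupling, assuming the model is handed in as a plain `complete(prompt) -> str` callable (the schema serialization and few-shot format below are illustrative, not any particular paper's):

```python
from typing import Callable

def build_prompt(question: str, schema: str,
                 examples: list[tuple[str, str]]) -> str:
    """Assemble a few-shot text-to-SQL prompt; the format is illustrative."""
    shots = "\n\n".join(f"Question: {q}\nSQL: {sql}" for q, sql in examples)
    return (
        f"Given the database schema:\n{schema}\n\n"
        f"{shots}\n\n"
        f"Question: {question}\nSQL:"
    )

def text_to_sql(question: str, schema: str,
                examples: list[tuple[str, str]],
                complete: Callable[[str], str]) -> str:
    """`complete` can wrap any LLM (an OpenAI client, a local Llama,
    a Phi checkpoint, ...); the prompting logic never touches the
    model directly."""
    return complete(build_prompt(question, schema, examples)).strip()
```

Swapping models then means swapping the `complete` callable, leaving the pipeline untouched.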
Since the model can be swapped freely, the choice of LLM becomes its own decision; below we compare candidate LLMs on several criteria.
Model | License | Parameters | Scope | Available Variants |
---|---|---|---|---|
GPT-3 | Proprietary | 0.35B/1.3B/6.7B/175B | General | Code, Instructions
GPT-3.5-Turbo | Proprietary | Undisclosed | General | Code, Instructions
GPT-4 | Proprietary | Undisclosed[1] | General | |
Llama | Non-commercial | 7B/13B/33B/65B | General | N/A |
Llama 2 | Llama 2 Community License[1-1] | 7B/13B/34B/70B | General | Code, Python, Instructions
Alpaca | Non-commercial | 7B | General | N/A |
Vicuna | Non-commercial | 7B/13B | General | |
Phi-1 | MIT | 1.3B | Python | |
Phi-1.5 | MIT | 1.3B | General | |
Phi-2 | MIT | 2.7B | General | Code |
Gemini-Nano | Proprietary | 1.8B/3.25B | General | |
Gemini (Pro/Ultra) | Proprietary | Undisclosed | General | |
Model | Source | API |
---|---|---|
GPT-3 | Unavailable | Retired
GPT-3.5-Turbo | Unavailable | Paid
GPT-4 | Unavailable | Paid
Llama | Available | Free |
Llama 2 | Available | Free |
Alpaca | Available | |
Vicuna | Available |
Phi-1 | Available | Free |
Phi-1.5 | Available | Free |
Phi-2 | Available | Free |
Gemini-Nano | Unavailable | Free[2] |
Gemini (Pro/Ultra) | Unavailable | Free[2-1]
Model | Input Price | Output Price |
---|---|---|
GPT-4 | $10 / 1M tokens[3] | $30 / 1M tokens
GPT-3.5-Turbo | $5 / 1M tokens | $15 / 1M tokens
Gemini-Nano | $0.125 / 1M characters | $0.375 / 1M characters
Gemini (Pro/Ultra) | $0.125 / 1M characters | $0.375 / 1M characters
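For back-of-the-envelope budgeting, a small helper can turn the table above into per-request dollar costs. The model keys and the `PRICES` dictionary below are assumptions hard-coded from this table; token counts come from whatever tokenizer you use.

```python
# $ per 1M tokens, copied from the pricing table above (assumed rates).
PRICES = {
    "gpt-4":         {"input": 10.0, "output": 30.0},
    "gpt-3.5-turbo": {"input": 5.0,  "output": 15.0},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-1M-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 2,000-token prompt returning a 150-token SQL query on GPT-4
# costs 2000*10/1e6 + 150*30/1e6 = $0.02 + $0.0045 = $0.0245.
print(f"${estimate_cost('gpt-4', 2_000, 150):.4f}")  # -> $0.0245
```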