Graphix-T5[1] is a state-of-the-art GNN[2]-based model for Text2SQL. It is currently the open-source model with the best exact-match accuracy on the Spider benchmark.
It modifies the T5[3] encoder by augmenting it with GNN layers.
The reader may need basic knowledge of graph-theory concepts to understand the construction process.
Graphix-T5 encodes all question tokens and all tables and columns as nodes in a graph. In that graph, edges denote relations between the different words, columns, and tables.
Fundamentally, the graph can be divided into 3 main components:

- Given a database schema $S = \langle T, C \rangle$, every table in $T$ and every column in $C$ becomes a schema node.
- Given a question $Q = \{q_1, \dots, q_{|Q|}\}$, every token $q_i$ becomes a question node.
- The input graph $G = \langle V, R \rangle$ then connects both node sets $V$ with relation-typed edges $R$, e.g. a column belonging to a table, or a question token exactly matching a column name.
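The construction above can be sketched in code. The following is a minimal, illustrative sketch of such a graph as plain node and edge lists; the node and relation names (`"has-column"`, `"exact-match"`) are assumptions for illustration, not the paper's exact relation vocabulary.

```python
def build_graph(question_tokens, tables, columns_by_table):
    """Build a heterogeneous graph over question tokens, tables, and columns."""
    nodes = []  # (node_id, node_type, text)
    edges = []  # (src_id, dst_id, relation_type)

    for tok in question_tokens:
        nodes.append((len(nodes), "question", tok))
    q_ids = list(range(len(question_tokens)))

    table_ids = {}
    for t in tables:
        table_ids[t] = len(nodes)
        nodes.append((len(nodes), "table", t))

    for t, cols in columns_by_table.items():
        for c in cols:
            c_id = len(nodes)
            nodes.append((c_id, "column", c))
            # schema edge: a column belongs to its table
            edges.append((table_ids[t], c_id, "has-column"))
            # linking edge: exact lexical match between a question token
            # and a column name (one of several possible linking relations)
            for i, tok in enumerate(question_tokens):
                if tok.lower() == c.lower():
                    edges.append((q_ids[i], c_id, "exact-match"))

    return nodes, edges

nodes, edges = build_graph(
    ["show", "all", "singers", "name"],
    ["singer"],
    {"singer": ["name", "age"]},
)
```

A real implementation would add many more relation types (partial matches, foreign keys, primary keys, etc.), but the data structure is the same: typed nodes plus relation-typed edges.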
Each Graphix layer incorporates two blocks: a semantic block, inherited from the T5 Transformer layer, and a structural block, a relational graph attention network over the input graph.
Graphix-T5 is essentially a modification of the T5 architecture.
For that reason, a decent understanding of Transformers in general, and T5 in particular, is recommended for the subsequent sections.
The semantic representations of the hidden states are first encoded by a Transformer block, which contains two important components: a multi-head self-attention network (MHA) and a fully-connected feed-forward network (FFN).
The attention layer maps a query matrix $Q$ to an output, using a key matrix $K$ and a value matrix $V$:

$$\mathrm{Att}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$

MHA calculates the attention output of each head and concatenates them as follows:

$$\mathrm{head}_i = \mathrm{Att}(Q W_i^{Q},\, K W_i^{K},\, V W_i^{V})$$

$$\mathrm{MHA}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O}$$

with $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$, and $W^{O}$ learned projection matrices.
Finally, the semantic values are extracted with an additional row-wise feed-forward network:

$$\mathrm{FFN}(x) = \max(0,\, x W_1 + b_1)\, W_2 + b_2$$
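The semantic block above can be sketched in a few lines of numpy. This is a minimal illustration of scaled dot-product attention, multi-head concatenation, and a row-wise FFN; the shapes and the ReLU activation are standard Transformer choices, not taken verbatim from the Graphix-T5 code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head_attention(X, W_q, W_k, W_v, W_o):
    # W_q, W_k, W_v: lists of per-head projection matrices; W_o mixes the heads
    heads = [attention(X @ wq, X @ wk, X @ wv)
             for wq, wk, wv in zip(W_q, W_k, W_v)]
    return np.concatenate(heads, axis=-1) @ W_o

def ffn(X, W1, b1, W2, b2):
    # row-wise fully-connected network, applied to each position independently
    return np.maximum(0.0, X @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
n, d, h = 5, 8, 2                      # sequence length, model dim, heads
X = rng.normal(size=(n, d))
W_q = [rng.normal(size=(d, d // h)) for _ in range(h)]
W_k = [rng.normal(size=(d, d // h)) for _ in range(h)]
W_v = [rng.normal(size=(d, d // h)) for _ in range(h)]
W_o = rng.normal(size=(d, d))
out = ffn(multi_head_attention(X, W_q, W_k, W_v, W_o),
          rng.normal(size=(d, d)), np.zeros(d),
          rng.normal(size=(d, d)), np.zeros(d))
```

Residual connections and layer normalization, which T5 also applies around each sub-layer, are omitted here for brevity.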
In each Graphix layer, structural representations are produced by a relational graph attention network (RGAT). It is formalised as follows:

$$e_{ij} = \frac{x_i W_q \left(x_j W_k + r_{ij}^{K}\right)^{\top}}{\sqrt{d_z}}, \qquad \alpha_{ij} = \mathrm{softmax}_{j}\!\left(e_{ij}\right)$$

$$z_i = \sum_{j} \alpha_{ij} \left(x_j W_v + r_{ij}^{V}\right)$$

where $r_{ij}^{K}$ and $r_{ij}^{V}$ are learned embeddings of the relation between nodes $i$ and $j$, biasing the key and value terms respectively.
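As a concrete sketch, the structural block can be written as relation-aware self-attention in the RAT-SQL style, where relation embeddings shift the key and value terms. This is a single-head, loop-based illustration under those assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relational_attention(X, rel, R_k, R_v, W_q, W_k, W_v):
    # X: (n, d) node states; rel: (n, n) integer relation ids between nodes;
    # R_k, R_v: (num_relations, d_z) relation embeddings for keys and values.
    n = X.shape[0]
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_z = Q.shape[-1]
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # the key of node j is shifted by the embedding of relation (i, j)
            scores[i, j] = Q[i] @ (K[j] + R_k[rel[i, j]]) / np.sqrt(d_z)
    alpha = softmax(scores, axis=-1)
    Z = np.empty((n, d_z))
    for i in range(n):
        # the value is likewise shifted by the relation embedding
        Z[i] = sum(alpha[i, j] * (V[j] + R_v[rel[i, j]]) for j in range(n))
    return Z

rng = np.random.default_rng(1)
n, d, d_z, n_rel = 4, 6, 6, 3          # nodes, input dim, head dim, relations
X = rng.normal(size=(n, d))
rel = rng.integers(0, n_rel, size=(n, n))
Z = relational_attention(X, rel,
                         rng.normal(size=(n_rel, d_z)),
                         rng.normal(size=(n_rel, d_z)),
                         rng.normal(size=(d, d_z)),
                         rng.normal(size=(d, d_z)),
                         rng.normal(size=(d, d_z)))
```

Compared with the plain attention in the semantic block, the only difference is the $r_{ij}^{K}$ and $r_{ij}^{V}$ terms, which let the attention weights depend on the typed edge between two nodes.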