The exact matching score is defined as the fraction of predictions that have an exact matching with the respective ground truth query.
Exact Matching was introduced as a refinement of String Matching on which commutativity of operators is used to reduce false negatives[1].
For each query, we will build its syntax tree. The syntax tree should be consistent with the construct of the SQL language.
A node is marked as green if its children can be rearranged without changing the overall query. Otherwise it is marked as red.
UNION
, INTERSECT
keywordsJOIN
, NATURAL JOIN
, OUTER JOIN
keywordsSELECT
, HAVING
keywords*
+
AND
OR
With the Green-Red decomposition, We will rearrange the syntax tree as follows:
def rearrange(tree):
for child in tree.children:
rearrange(child)
if colour(tree) == "green":
tree.children.sort(key=str)
The exact matching accuracy
rearrange
can be calculated in linearithmic time[2]. And so we can calculate the exact matching between two queries in linearithmic time. Which is considered efficient.