The exact matching score is defined as the fraction of predictions that have an exact matching with the respective ground truth query.
Exact Matching was introduced as a refinement of String Matching on which commutativity of operators is used to reduce false negatives[1].
For each query, we will build its syntax tree. The syntax tree should be consistent with the construct of the SQL language.
A node is marked as green if its children can be rearranged without changing the overall query. Otherwise it is marked as red.
UNION, INTERSECT keywordsJOIN, NATURAL JOIN, OUTER JOIN keywordsSELECT, HAVING keywords*+ANDORWith the Green-Red decomposition, We will rearrange the syntax tree as follows:
def rearrange(tree):
for child in tree.children:
rearrange(child)
if colour(tree) == "green":
tree.children.sort(key=str)
The exact matching accuracy
rearrange can be calculated in linearithmic time[2]. And so we can calculate the exact matching between two queries in linearithmic time. Which is considered efficient.