String Matching

Quick-start

String matching is the most naive comparison function: it calculates the fraction of predictions that are exactly identical to the ground-truth query.

Formula

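The string matching accuracy between a set of predicted queries and the corresponding ground-truth queries is the number of predictions that are exactly identical to their ground-truth query, divided by the total number of predictions.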
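
One way to write this down, with symbols chosen here purely for illustration: let $p_1, \dots, p_n$ be the predicted queries, $g_1, \dots, g_n$ the ground-truth queries, and $\mathbb{1}[\cdot]$ the indicator that equals 1 when its argument holds and 0 otherwise. Equality $p_i = g_i$ is assumed to mean character-for-character string equality.

```latex
\mathrm{Acc}_{\mathrm{SM}}(P, G) \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\left[\, p_i = g_i \,\right]
```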

Advantages

  • It is by far the easiest matching criterion to implement.
  • Calculating it is very efficient.
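
A minimal sketch of the metric in Python, assuming the predictions and ground-truth queries are given as two equally long sequences of strings. The function name `string_matching_accuracy` is illustrative and not part of any existing library.

```python
from typing import Sequence


def string_matching_accuracy(
    predictions: Sequence[str], ground_truths: Sequence[str]
) -> float:
    """Fraction of predictions that are exactly identical to their ground-truth query."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not predictions:
        raise ValueError("at least one query pair is required")
    # One plain string comparison per pair; no parsing or normalisation is applied.
    matches = sum(p == g for p, g in zip(predictions, ground_truths))
    return matches / len(predictions)
```

The whole computation is a single pass over the pairs, which is why the metric is cheap to evaluate even on large test sets.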

Shortcomings

This metric is too restrictive: it penalises any difference, even when the prediction is still equivalent to the ground-truth query.
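
To give a sense of how strict this is, the sketch below compares a reference query against a few variants that most SQL engines would treat the same way. The queries and the variant labels are made up for illustration.

```python
reference = "SELECT FIRST_NAME FROM PERSON;"

# Hypothetical predictions that differ from the reference only superficially.
variants = {
    "identical": "SELECT FIRST_NAME FROM PERSON;",
    "keyword case": "select FIRST_NAME from PERSON;",
    "extra whitespace": "SELECT  FIRST_NAME FROM PERSON;",
    "no trailing semicolon": "SELECT FIRST_NAME FROM PERSON",
}

# Exact string comparison: only the character-for-character copy matches.
results = {name: query == reference for name, query in variants.items()}

for name, is_match in results.items():
    print(f"{name}: {'match' if is_match else 'no match'}")
```

Only the first variant counts as correct; every other one is scored as a wrong prediction.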

Example

Let:

  • the question be: Find the person who has the same first name as last name
  • the database be defined by:
    CREATE TABLE PERSON(
    ID int primary key,
    FIRST_NAME VARCHAR(20),
    LAST_NAME VARCHAR(20)
    )
    
  • the ground-truth query be:
SELECT FIRST_NAME,LAST_NAME from PERSON where FIRST_NAME==LAST_NAME;

Suppose that the predicted query is:

 SELECT FIRST_NAME,LAST_NAME from PERSON where LAST_NAME==FIRST_NAME;

Then, even though both queries are semantically equivalent, string matching considers the prediction incorrect, because the two operands of the comparison appear in a different order.
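
The gap between the two notions of correctness can be illustrated with a small sketch that runs both queries against an in-memory SQLite database, which accepts the `==` operator used above. The sample rows are hypothetical, and agreeing on one database instance only suggests equivalence rather than proving it.

```python
import sqlite3

schema = """
CREATE TABLE PERSON(
    ID int primary key,
    FIRST_NAME VARCHAR(20),
    LAST_NAME VARCHAR(20)
)
"""
ground_truth = "SELECT FIRST_NAME,LAST_NAME from PERSON where FIRST_NAME==LAST_NAME;"
prediction = "SELECT FIRST_NAME,LAST_NAME from PERSON where LAST_NAME==FIRST_NAME;"

conn = sqlite3.connect(":memory:")
conn.execute(schema)
# Hypothetical sample data: two people whose first and last names coincide.
conn.executemany(
    "INSERT INTO PERSON VALUES (?, ?, ?)",
    [(1, "James", "James"), (2, "Ada", "Lovelace"), (3, "Morgan", "Morgan")],
)

# String matching: the queries differ as strings, so the prediction is rejected.
string_match = prediction == ground_truth

# Execution on this database: both queries return the same rows.
same_rows = sorted(conn.execute(prediction).fetchall()) == sorted(
    conn.execute(ground_truth).fetchall()
)
conn.close()

print("string match:", string_match)
print("same rows:", same_rows)
```

Here the string comparison fails while both queries return the same two rows.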