In the evaluation scripts, the following piece of code is used to generate scores for Execution Accuracy (EX):
def execute_sql(predicted_sql,ground_truth, db_path):
    conn = sqlite3.connect(db_path)
    # Connect to the database
    cursor = conn.cursor()
    cursor.execute(predicted_sql)
    predicted_res = cursor.fetchall()
    cursor.execute(ground_truth)
    ground_truth_res = cursor.fetchall()
    res = 0
    if set(predicted_res) == set(ground_truth_res):
        res = 1
    return res
Given that the retrieved result sets are converted into Python sets before comparison, isn't this ignoring (a) DISTINCT errors and (b) incorrect row ordering errors?
For (a), if the ground truth query includes a DISTINCT clause but the generated one does not, an EX of 1 is still assigned.
Example:
In [1]: res = [('apple',), ('pear',)]
In [2]: gen_res = [('apple',), ('apple',), ('pear',)]
In [3]: set(res) == set(gen_res)
Out[3]: True
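One possible way to make the comparison sensitive to duplicate rows is to compare the results as multisets rather than sets. The sketch below uses `collections.Counter` for that; the helper name `same_rows_with_duplicates` is hypothetical and not part of the evaluation script.

```python
from collections import Counter


def same_rows_with_duplicates(predicted_res, ground_truth_res):
    # Hypothetical helper: compares rows as multisets, so the number of
    # times each row occurs matters, but the row order does not.
    return Counter(predicted_res) == Counter(ground_truth_res)


res = [('apple',), ('pear',)]
gen_res = [('apple',), ('apple',), ('pear',)]

print(set(res) == set(gen_res))                  # True: duplicates are lost
print(same_rows_with_duplicates(gen_res, res))   # False: duplicates are counted
```

This relies on the rows being hashable, which holds for the tuples returned by `cursor.fetchall()`.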
For (b), if the ground truth query includes an ORDER BY clause but the generated one does not, an EX of 1 is still assigned.
Example:
In [4]: res = [('apple',), ('pear',)]
In [5]: gen_res = [('pear',),('apple',)]
In [6]: set(res) == set(gen_res)
Out[6]: True
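A stricter check for the ordering case could compare the fetched rows as lists whenever the ground truth query is order-sensitive. The sketch below assumes the caller already knows whether order matters; the `order_matters` flag and the function name `rows_match` are hypothetical and not part of the evaluation script.

```python
from collections import Counter


def rows_match(predicted_res, ground_truth_res, order_matters=False):
    # Hypothetical helper, not the evaluation script's own logic.
    if order_matters:
        # Position-by-position comparison: row order and duplicates both count.
        return list(predicted_res) == list(ground_truth_res)
    # Order-insensitive comparison that still counts duplicate rows.
    return Counter(predicted_res) == Counter(ground_truth_res)


res = [('apple',), ('pear',)]
gen_res = [('pear',), ('apple',)]

print(rows_match(gen_res, res))                      # True: same rows, order ignored
print(rows_match(gen_res, res, order_matters=True))  # False: order differs
```

How to decide when order matters (for example, by checking whether the ground truth SQL contains an ORDER BY) is left open here.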