Execution Accuracy Metric definition incorrect (?) #177

Open
Skeftical opened this issue Dec 2, 2024 · 0 comments

Hello,

In the evaluation scripts, the following code is used to compute Execution Accuracy (EX) scores:

import sqlite3

def execute_sql(predicted_sql, ground_truth, db_path):
    # Connect to the database
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    # Run both queries and fetch all rows as lists of tuples
    cursor.execute(predicted_sql)
    predicted_res = cursor.fetchall()
    cursor.execute(ground_truth)
    ground_truth_res = cursor.fetchall()
    res = 0
    # Set comparison: duplicates and row order are discarded
    if set(predicted_res) == set(ground_truth_res):
        res = 1
    return res

Since both result lists are converted to Python sets before they are compared, doesn't this ignore (a) DISTINCT errors and (b) row-ordering errors?
For (a), if the ground-truth query includes a DISTINCT clause but the generated query does not, an EX of 1 is still assigned, because duplicate rows collapse into a single set element.
Example:

In [1]: res = [('apple',), ('pear',)]

In [2]: gen_res = [('apple',), ('apple',), ('pear',)]

In [3]: set(res) == set(gen_res)
Out[3]: True
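
The same effect can be reproduced end to end against SQLite. This is a minimal sketch, assuming a hypothetical in-memory table `fruit` that is not part of the benchmark data, and using the same set-based comparison as the evaluation code:

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (name TEXT)")
conn.executemany(
    "INSERT INTO fruit VALUES (?)",
    [("apple",), ("apple",), ("pear",)],
)

# Ground truth deduplicates; the "generated" query does not.
ground_truth_res = conn.execute("SELECT DISTINCT name FROM fruit").fetchall()
predicted_res = conn.execute("SELECT name FROM fruit").fetchall()
conn.close()

# The row counts differ, yet the set comparison reports a match.
row_counts = (len(ground_truth_res), len(predicted_res))
set_match = set(predicted_res) == set(ground_truth_res)
print(row_counts, set_match)  # (2, 3) True
```
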

For (b), if the ground-truth query includes an ORDER BY clause but the generated query does not, an EX of 1 is still assigned, because sets carry no ordering.

Example:

In [4]: res = [('apple',), ('pear',)]

In [5]: gen_res = [('pear',),('apple',)]

In [6]: set(res) == set(gen_res)
Out[6]: True
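
One possible stricter comparison, offered as a suggestion and not as the benchmark's intended definition: compare the rows as multisets so that duplicates count, and compare them as lists when the ground-truth query fixes an order. The function name `results_match` and the `ordered` flag are hypothetical and do not exist in the evaluation scripts.

```python
from collections import Counter

def results_match(predicted_res, ground_truth_res, ordered=False):
    """Return 1 if the two result lists match, else 0.

    ordered=False: multiset comparison (duplicates matter, order does not).
    ordered=True:  list comparison (duplicates and order both matter).
    """
    if ordered:
        return int(list(predicted_res) == list(ground_truth_res))
    return int(Counter(predicted_res) == Counter(ground_truth_res))

# (a) A missing DISTINCT is detected: the duplicate 'apple' row counts.
print(results_match([('apple',), ('apple',), ('pear',)],
                    [('apple',), ('pear',)]))  # 0

# (b) A wrong row order is detected when ordered=True.
print(results_match([('pear',), ('apple',)],
                    [('apple',), ('pear',)], ordered=True))  # 0
```

How to decide when `ordered` should be true (for example, whenever the ground-truth SQL contains an ORDER BY clause) is left open here.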