Named query generation #655
base: main
            raise argparse.ArgumentTypeError(msg)

    def create_query_directory(self):
        queries_dir = self.script_dir.parent / "queries"
The outstanding question I have is whether we want the queries directory in the GitHub repo or not. I think I lean towards 'yes', because it's nice to keep a record of these things, and it's easy to delete things in Athena. But maybe it should be 'no' until we're actually using CI (or some other automation) to run the queries. This is because we'll have two sources of truth (Athena and GitHub) until that's the case...
Not a review, but a reminder that we need to update the two API users CSV files on S3 (that are joined via a Glue table) before running queries. I think the default join is INNER, so API keys missing from the CSV files just get excluded from the query results. I have hacked together a script in devs.DC locally to update the CSV. I need to commit this, or find another way to update the CSV file. We need to do the same for the EC API.
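For illustration, a rough sketch of the kind of pre-flight check that would catch keys an INNER join would silently drop. This is not the devs.DC script; the function name `find_missing_api_keys`, the `api_key` column name, and the sample data are all hypothetical, and the real CSV layout may differ.

```python
import csv
import io


def find_missing_api_keys(log_keys, users_csv_text, key_column="api_key"):
    """Return the keys seen in the logs that have no row in the users CSV.

    An INNER join drops exactly these keys, so a non-empty result means
    the CSV should be updated before the queries are run.
    """
    reader = csv.DictReader(io.StringIO(users_csv_text))
    known_keys = {row[key_column] for row in reader}
    return sorted(set(log_keys) - known_keys)


# Hypothetical sample data: the column names and keys are assumptions.
users_csv = "api_key,user_name\nabc123,Alice\ndef456,Bob\n"
keys_in_logs = ["abc123", "def456", "zzz999", "abc123"]

missing = find_missing_api_keys(keys_in_logs, users_csv)
print(missing)  # ['zzz999']
```

In practice the CSV text would come from the files on S3 and the key list from the logs being queried; both are stubbed here so the check itself is easy to see.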
This is now in the state I used to generate the results for the 2024-07-04 General Election. So it's probably worth thinking about while it's still fresh-ish.