Named query generation #655

Open: 7 commits to be merged into base: main

GeoWill commented Jul 3, 2024

This is now in the state in which I used it to generate the results for the 2024-07-04 General Election, so it's probably worth reviewing while it's still fresh-ish.

GeoWill force-pushed the named-query-generation branch 4 times, most recently from 9d68901 to 7a32792, July 3, 2024 18:17
GeoWill force-pushed the named-query-generation branch from 7a32792 to 89feacc, July 11, 2024 07:57
            raise argparse.ArgumentTypeError(msg)

    def create_query_directory(self):
        queries_dir = self.script_dir.parent / "queries"

GeoWill (Author) commented:

The outstanding question I have is whether we want the queries directory in the gh repo or not. I lean towards 'yes', because it's nice to keep a record of these things, and it's easy to delete things in Athena. But maybe it should be 'no' until we're actually using CI (or some other automation) to run the queries, because until then we'll have 2 sources of truth (Athena and gh).
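If the queries directory does stay in the repo, the "2 sources of truth" worry could be made visible with a small drift check. The following is only a sketch under assumptions: it assumes queries are stored as one `.sql` file per named query, and that the Athena side has already been fetched into a plain name-to-SQL mapping by some other means. The function names `load_local_queries` and `find_drift` are hypothetical and not part of this PR.

```python
from pathlib import Path


def load_local_queries(queries_dir: Path) -> dict[str, str]:
    """Map each .sql file's stem to its stripped contents.

    Assumes one named query per file, e.g. queries/foo.sql -> "foo".
    """
    return {
        path.stem: path.read_text().strip()
        for path in sorted(queries_dir.glob("*.sql"))
    }


def find_drift(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare two name -> SQL mappings and report where they disagree.

    `remote` stands in for whatever is currently saved in Athena; how it
    is fetched is deliberately left out of this sketch.
    """
    shared = local.keys() & remote.keys()
    return {
        "only_local": sorted(local.keys() - remote.keys()),
        "only_remote": sorted(remote.keys() - local.keys()),
        "changed": sorted(
            name for name in shared
            if local[name].strip() != remote[name].strip()
        ),
    }
```

An empty result in all three lists would mean the repo and Athena agree; anything else flags a query that was edited or deleted on one side only.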

GeoWill self-assigned this Jul 11, 2024
GeoWill requested a review from symroe July 11, 2024 08:03
symroe (Member) commented Aug 12, 2024

Not a review, but a reminder that we need to update the two API users CSV files on S3 (that are joined via a Glue table) before running queries. I think the default join is INNER, so API keys missing from the CSV files just get excluded from the query results.

I have hacked together a script in devs.DC locally to update the CSV. I need to commit it, or find another way to update the CSV file. We need to do the same for the EC API.
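Because an INNER join silently drops unmatched rows, it may be worth checking for missing keys before running the queries. This is a rough sketch only, not the script mentioned above: the column name `api_key` and the helper names are assumptions, and it works on CSV text already downloaded from S3 plus a list of keys seen in the logs.

```python
import csv
import io


def keys_in_csv(csv_text: str, key_column: str = "api_key") -> set[str]:
    """Collect the non-empty values of `key_column` from CSV text.

    `api_key` is a placeholder column name; the real files may differ.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column].strip() for row in reader if row.get(key_column)}


def missing_keys(logged_keys, csv_text: str, key_column: str = "api_key") -> list[str]:
    """Return logged keys that have no row in the CSV.

    These are the keys an INNER join against the users table would drop.
    """
    return sorted(set(logged_keys) - keys_in_csv(csv_text, key_column))


# Illustrative data only.
users_csv = "api_key,name\nabc123,Alice\ndef456,Bob\n"
print(missing_keys(["abc123", "zzz999"], users_csv))
```

Anything this reports would need adding to the CSV files first, otherwise usage for those keys would not appear in the results.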
