Commit
Update file decompression to work with Windows
eldon-tuva committed Jun 21, 2024
1 parent a493ed6 commit 932c0dc
Showing 2 changed files with 9 additions and 3 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -2,7 +2,7 @@

This script will load the Tuva Health seed data from AWS S3 into a Postgres database.

-This script assuems you already have your AWS environment variables set.
+This script assumes you already have your AWS environment variables set.

To use please create a `config.yml` file. You can use the `config.yml.example` as a template.

10 changes: 8 additions & 2 deletions s3-to-postgres.py
@@ -1,7 +1,11 @@
#!/usr/bin/env python3

import yaml
import re
import boto3
import os
import gzip
import shutil
import psycopg2
import glob
import csv
@@ -99,8 +103,10 @@ def load_files_to_postgres(download_dir, pg_connection_string, s3_paths, headers

local_csv_path = file_path.replace('.gz', '')

-# Decompress the file
-os.system(f'gunzip -c {file_path} > {local_csv_path}')
+# Decompress the file using gzip
+with gzip.open(file_path, 'rb') as f_in:
+    with open(local_csv_path, 'wb') as f_out:
+        shutil.copyfileobj(f_in, f_out)

# Use COPY to load data with predefined headers
copy_query = f"COPY {full_table_name} ({columns}) FROM STDIN WITH CSV DELIMITER ','"
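
For reference, the gzip-based decompression in this change can be tried in isolation. The following is a minimal sketch, not the code from `s3-to-postgres.py`: the helper name `decompress_gzip` is hypothetical, and it strips only a trailing `.gz` from the path (the script itself uses `file_path.replace('.gz', '')`). It relies only on the standard library, so it should behave the same on Windows, where `gunzip` is usually not available.

```python
import gzip
import os
import shutil
import tempfile


def decompress_gzip(file_path: str) -> str:
    """Decompress a .gz file next to itself and return the output path.

    Hypothetical helper for illustration; not part of the original script.
    """
    # Assumption: strip only a trailing ".gz" so a ".gz" elsewhere in the
    # path (for example in a directory name) is left untouched.
    if file_path.endswith(".gz"):
        local_csv_path = file_path[:-3]
    else:
        local_csv_path = file_path + ".out"

    # Stream the decompressed bytes in chunks; no shell or gunzip binary needed.
    with gzip.open(file_path, "rb") as f_in, open(local_csv_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return local_csv_path


if __name__ == "__main__":
    # Round trip on a throwaway file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        gz_path = os.path.join(tmp, "sample.csv.gz")
        with gzip.open(gz_path, "wb") as f:
            f.write(b"id,name\n1,alice\n")
        out_path = decompress_gzip(gz_path)
        with open(out_path, "rb") as f:
            print(f.read())
```

Because `shutil.copyfileobj` copies in fixed-size chunks, memory use stays flat even for large seed files.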