Add CBS data to an S3 bucket - create a repo for each provider code and for each year.
Import CBS data from email and upload to AWS - currently the importmail process runs once a week and uploads to S3 the data from the last 2 CBS emails.
Trigger the Load CBS data process routinely, once a week.
Command to delete the data of a certain year onward and load starting from that year, for example 2019: `python main.py process cbs --source s3 --load_start_year=2019`
The CBS parser is in anyway/parsers/cbs/executor.py
Delete only the data from the year of the newly arrived files onward (see the sketch below).
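A minimal sketch of the delete-then-reload step, assuming a SQLAlchemy connection and a `markers_hebrew` table with an `accident_year` column; these names are assumptions, not necessarily the actual anyway schema (the real logic lives in anyway/parsers/cbs/executor.py):

```python
# Hypothetical sketch: remove CBS rows from load_start_year onward so the
# subsequent load can re-insert them from the newly arrived files.
from sqlalchemy import create_engine, text

def delete_cbs_data_from_year(db_url: str, load_start_year: int) -> None:
    engine = create_engine(db_url)
    with engine.begin() as conn:  # transactional: all-or-nothing
        conn.execute(
            text("DELETE FROM markers_hebrew WHERE accident_year >= :year"),
            {"year": load_start_year},
        )

# Afterwards the regular load is triggered, e.g.:
#   python main.py process cbs --source s3 --load_start_year=2019
```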
Create a DB table for versioning the emails we load from email to S3, and load new data to S3 ONLY when new email data arrives.
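A minimal sketch of such a versioning table, assuming SQLAlchemy; the table and column names are illustrative only. The importmail step would consult this table and upload to S3 only when the email's message id has not been recorded yet.

```python
# Hypothetical email-versioning table (names are assumptions, not an existing
# anyway schema).
from datetime import datetime
from sqlalchemy import Column, Integer, String, DateTime, UniqueConstraint
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class CBSEmailVersion(Base):
    __tablename__ = "cbs_email_versions"  # hypothetical name

    id = Column(Integer, primary_key=True)
    email_message_id = Column(String, nullable=False)   # Message-ID header of the CBS email
    received_at = Column(DateTime, nullable=False)      # when the email arrived
    uploaded_to_s3_at = Column(DateTime, default=datetime.utcnow)  # when it was pushed to S3

    # an email should be recorded (and uploaded) only once
    __table_args__ = (UniqueConstraint("email_message_id"),)
```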
While loading data from email to S3, detect which years are being loaded, and use the earliest year as load_start_year when triggering the CBS load. You may use an additional table.
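One possible way to derive load_start_year from the files uploaded in the current run - a sketch only; the filename pattern, helper names, and the way the load is triggered here are assumptions:

```python
# Hypothetical helpers: extract the earliest year from the uploaded file names
# and trigger the CBS load starting from that year.
import re
import subprocess
from typing import List, Optional

def earliest_year(uploaded_filenames: List[str]) -> Optional[int]:
    years = []
    for name in uploaded_filenames:
        match = re.search(r"(20\d{2})", name)  # assumes the year appears in the file name
        if match:
            years.append(int(match.group(1)))
    return min(years) if years else None

def trigger_cbs_load(uploaded_filenames: List[str]) -> None:
    year = earliest_year(uploaded_filenames)
    if year is None:
        return  # nothing new arrived - skip the load entirely
    subprocess.run(
        ["python", "main.py", "process", "cbs",
         "--source", "s3", f"--load_start_year={year}"],
        check=True,
    )
```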
Modify the schedule from weekly back to daily, since CBS data will be loaded only when a new email arrives (see this PR that changed it from daily to weekly).
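In Airflow terms the change is just the DAG's schedule interval. A sketch, assuming Airflow 2.x; the DAG id and task are placeholders, not the actual anyway-etl definitions:

```python
# Hypothetical DAG showing the weekly -> daily schedule change.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="cbs_process",             # placeholder DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",       # was "@weekly"; daily is cheap because the
    catchup=False,                    # load runs only when a new email has arrived
) as dag:
    BashOperator(
        task_id="process_cbs",
        bash_command="python main.py process cbs --source s3",
    )
```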
Note that CBS processes are now in the anyway-etl repo - see the process repo here.
Modify CBS weekly data collection script
The current Airflow setup is here.