@ChakshuGautam Please review the approach below and suggest any changes required.

Issue with Dimension Upsert
If we upsert dimension data using the /ingestion/dimension API, the data in the DB is updated, but the updated data is never written back to the CSV file. So if we re-run `yarn cli ingest`, the stale CSV overwrites the table and all of the upserted changes are lost.
Proposed Approach for Dimension Upsert
- Create an API, /ingestion/dimension, that accepts a CSV file as input; the name of the CSV file should match the dimension name.
- Validate that all the required columns for that table are present.
- If any column is missing or invalid, return an appropriate error and stop the ingestion (see the validation sketch below).
- If individual records have errors, append those records to an error file and upload it to the ingestion_error folder.
- Otherwise, insert the records into the table, updating the existing rows on conflict (see the upsert sketch below).
- Once all the records are upserted, fetch all the records from the table and write them to a new CSV file.
- Upload that CSV file to the processed_input -> dimensions folder (see the export sketch below).
NOTE: This assumes the input provides data for every column of the dimension table except the auto-generated ID column.
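
A minimal validation sketch for the column check, assuming the csv-parse package is available; the file path and required column names here are illustrative, not taken from the actual schema:

```ts
import { parse } from 'csv-parse/sync';
import * as fs from 'fs';

// Returns the list of required columns that are missing from the CSV header.
function findMissingColumns(csvPath: string, requiredColumns: string[]): string[] {
  const content = fs.readFileSync(csvPath, 'utf-8');
  // Parse only the first line of the file to get the header row.
  const [header] = parse(content, { to_line: 1 }) as string[][];
  return requiredColumns.filter((col) => !header.includes(col));
}

const missing = findMissingColumns('./district.csv', ['district_id', 'district_name']);
if (missing.length > 0) {
  // Abort the ingestion and surface the column error to the caller.
  throw new Error(`Missing required columns: ${missing.join(', ')}`);
}
```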
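A rough sketch of the upsert step, assuming a PostgreSQL table and the node-postgres (pg) client; the table name, column names, and conflict key are illustrative:

```ts
import { Pool } from 'pg';

const pool = new Pool();

// Upserts each row; rows that fail are collected so they can be written to
// the error file and uploaded to ingestion_error, as described above.
async function upsertRecords(rows: Array<Record<string, string>>) {
  const errorRows: Array<Record<string, string>> = [];
  for (const row of rows) {
    try {
      // ON CONFLICT turns the insert into an upsert: rows matching on the
      // dimension's key are updated instead of raising a duplicate error.
      await pool.query(
        `INSERT INTO dimensions.district (district_id, district_name)
         VALUES ($1, $2)
         ON CONFLICT (district_id)
         DO UPDATE SET district_name = EXCLUDED.district_name`,
        [row.district_id, row.district_name],
      );
    } catch (err) {
      errorRows.push(row);
    }
  }
  return errorRows;
}
```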
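And a sketch of the write-back step: after the upsert, dump the full table to a CSV and upload it to processed_input/dimensions so a later `yarn cli ingest` starts from the updated data. `uploadFile` is a hypothetical helper standing in for whatever storage client the project actually uses:

```ts
import { Pool } from 'pg';
import { stringify } from 'csv-stringify/sync';
import * as fs from 'fs';

const pool = new Pool();

// Hypothetical upload helper; replace with the project's storage client.
declare function uploadFile(localPath: string, remotePath: string): Promise<void>;

async function exportDimension(tableName: string) {
  // tableName is assumed to be validated upstream against known dimensions.
  const { rows } = await pool.query(`SELECT * FROM dimensions.${tableName}`);
  // csv-stringify derives the header row from the object keys.
  const csv = stringify(rows, { header: true });
  const localPath = `/tmp/${tableName}.csv`;
  fs.writeFileSync(localPath, csv);
  await uploadFile(localPath, `processed_input/dimensions/${tableName}.csv`);
}
```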