Remove methane-farms collection from staging #116

Closed
3 tasks done
anayeaye opened this issue Apr 25, 2024 · 4 comments
@anayeaye
Contributor

anayeaye commented Apr 25, 2024

What

"Methane Emissions Manure Management (methane-farms)" needs to be deleted or removed.

This issue is to manage the probably small task of removing a bit of metadata from the staging catalog but is also an opportunity to start discussing VEDA data lifecycle/sunsetting.

Notes

This is a good opportunity to start discussing sunsetting data in terms of the data life cycle. Here are some initial thoughts about how this process could look:

  1. Open veda-data issue so that we have a trail for the removal
  2. Is this collection tied to a dashboard dataset config? If yes, that dataset mdx needs to be removed first, along with any stories that use that dataset.
  3. Are there any documents/notebooks for this collection? Ctrl-f for the collection id in veda-docs (https://github.com/NASA-IMPACT/veda-docs).
  4. Are we only removing database records, or are there S3 objects to remove?
  • Yes, we're removing objects:
    • Are these objects assets of any other collections? It is possible to have hrefs from different collections pointing to the same file in S3. If another collection uses the objects, do not delete them.
    • If they are not referenced in other collections, it is OK to delete them or change the S3 object lifecycle policy (it might be possible to infer this from veda-data/data/input_config, https://github.com/NASA-IMPACT/veda-data/tree/main/ingestion-data/discovery-items: ctrl-f for the bucket pattern). NOTE: if this is going to be a large number of objects, consider editing the lifecycle policy instead; it can be more cost effective to set an expiry date on objects than to perform an S3 delete.
  5. Remove database records: PgStac implements a cascading delete, so we only need to delete the collection and all child/item records will also be deleted. (For example, https://dev.openveda.cloud/api/ingest/docs#/Collection/delete_collection_collections__collection_id__delete)
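
As a rough illustration of the final step, here is a minimal sketch that prepares the delete request without sending it. The `/collections/{collection_id}` path is inferred from the operation name in the docs link above, and the bearer-token header is an assumption about how the ingest API authenticates; `build_delete_collection_request` is a hypothetical helper, not part of any VEDA tooling.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: base URL taken from the docs link above (dev environment).
INGEST_API = "https://dev.openveda.cloud/api/ingest"


def build_delete_collection_request(
    collection_id: str, token: str, base_url: str = INGEST_API
) -> Request:
    """Prepare (but do not send) a DELETE request for a single collection."""
    # Assumption: the route is /collections/{collection_id}, as the
    # operation id in the docs URL suggests.
    url = f"{base_url.rstrip('/')}/collections/{quote(collection_id, safe='')}"
    # Assumption: the API expects a bearer token in the Authorization header.
    return Request(url, method="DELETE", headers={"Authorization": f"Bearer {token}"})


req = build_delete_collection_request("methane-farms", token="<token>")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would actually perform the delete. Because PgStac cascades the delete to all child items, it is worth reviewing the printed URL before sending.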

AC

  • Confirm that no downstream users/front ends are impacted
  • Check if S3 objects should be removed and delete objects IF APPLICABLE
  • Remove methane-farms collection from staging STAC catalog
@anayeaye
Contributor Author

Related veda-config branch: https://github.com/NASA-IMPACT/veda-config/tree/nc-hogs

@anayeaye
Contributor Author

anayeaye commented May 8, 2024

Note:
Looks like s3://veda-data-store-staging/hog-farms/express_methane_cog_2020.tif is now referenced in a new methane-manure collection, so these two s3 objects should not be deleted. I will circle back and double-check the collection, but this should be a very simple ingest-api delete operation to close.
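
For reference, the shared-object check can be sketched roughly as follows. This is only an illustration over STAC item dictionaries that have already been fetched; the item ids, the `cog_default` asset key, the second href, and the `shared_hrefs` helper are all hypothetical.

```python
from collections import defaultdict


def shared_hrefs(target_id, items):
    """Map each asset href of `target_id` to the other collections that also use it."""
    users = defaultdict(set)
    for item in items:
        # STAC items carry their collection id and a dict of assets with hrefs.
        for asset in item.get("assets", {}).values():
            users[asset["href"]].add(item["collection"])
    return {
        href: sorted(collections - {target_id})
        for href, collections in users.items()
        if target_id in collections and len(collections) > 1
    }


cog = "s3://veda-data-store-staging/hog-farms/express_methane_cog_2020.tif"
items = [
    {"id": "item-a", "collection": "methane-farms",
     "assets": {"cog_default": {"href": cog}}},
    {"id": "item-b", "collection": "methane-manure",
     "assets": {"cog_default": {"href": cog}}},
    {"id": "item-c", "collection": "methane-farms",
     "assets": {"cog_default": {"href": "s3://example-bucket/only-in-methane-farms.tif"}}},
]

shared = shared_hrefs("methane-farms", items)
print(shared)
```

Any href that appears in the result is still in use by another collection and should be left in place; hrefs that do not appear are candidates for deletion or a lifecycle expiry rule.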

When we use the airflow transfer to production DAG, the COGs will be stored in a bucket matching the collection name.

@anayeaye anayeaye self-assigned this May 8, 2024
@anayeaye
Contributor Author

anayeaye commented May 8, 2024

After determining that we are keeping the objects, I found no other dependencies on the methane-farms collection and I have now deleted it.

From veda-config, it looks like the replacement collection is methane-manure, which is unfortunately invalid. I will reach out to see if we can help get the collection STAC metadata corrected or if it is still in the testing stage.

@anayeaye
Contributor Author

Also here is a forward PR to fix up the replacement collection: #120
