Temporarily disable saving clean tags to disk #2557
Merged
Fixes
Fixes #2556 by @obulat
Description
This PR adds a check for the `field` when saving the cleaned-up data. We skip `tags` so that the disk space is not filled up during the data refresh. The tags are still cleaned before saving them to the database.
Testing Instructions
I used the CSV data from the gist in this PR description, from #904. The sample data loading has been updated to check the document count in the index, so to test locally I disabled those checks (there are only 1900 images in that gist).
When I ran `just init`, I saw the files in the `/ingestion_server` folder, without `tags.csv`.
Checklist
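For readers who want a picture of the check mentioned in the Description, here is a minimal sketch. It is not the code from this PR: `SKIPPED_FIELDS`, `save_cleaned_data`, the shape of the `cleaned` mapping, and the one-CSV-per-field layout are all assumptions made for illustration.

```python
import csv
from pathlib import Path

# Hypothetical: fields whose cleaned values are not written to disk.
SKIPPED_FIELDS = {"tags"}


def save_cleaned_data(cleaned, out_dir):
    """Append cleaned rows to one CSV file per field, skipping SKIPPED_FIELDS.

    `cleaned` is assumed to map a field name to a list of
    (identifier, cleaned_value) rows. Returns the number of rows
    written for each field that was saved.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for field, rows in cleaned.items():
        if field in SKIPPED_FIELDS:
            # The values were still cleaned upstream; only the
            # on-disk copy is skipped to save disk space.
            continue
        with open(out_dir / f"{field}.csv", "a", newline="") as f:
            csv.writer(f).writerows(rows)
        written[field] = len(rows)
    return written
```

With this shape, re-enabling the file for a field later would only require removing it from the skip set.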