Proposal: New API to replace existing arrays in npz files #68
Comments
Thanks for your proposal, I think having this feature would be really nice! Regarding the API, the boolean parameter |
I may fail to explain this part well: I was not proposing to change the existing After doing that, we can consider whether to make a change as you suggested on (I'm going to start implementing the necessary bits.) PS: Switching to a |
Thanks for the clarification. Let's proceed in two steps then, first adding an overload to |
Currently, `dump_npz` either destroys all existing arrays in an npz file or appends arrays whose names already exist in the file as duplicate entries. These are rather strange semantics, especially since loading individual arrays with `load_npz` doesn't follow them.

More reasonable semantics would be to replace existing arrays. This requires a context object, with a role similar to `HighFive::File`.

NumPy doesn't support this. It only overwrites all arrays at once.

I checked out libzippp and libzip++; neither works with streams. That's primarily because `dump_npy_stream` and libzip are both "push" interfaces (thus, both need to run the main loop), so they cannot work together without writing a special stream class that serves as a pipe so that libzip can pull data from it...

I think there is a rather simple way to support replacing semantics given a context object. When an `npz_file` is opened for update, append as usual while keeping the central directory as a data structure in memory. When closing, write a temporary file with only the up-to-date arrays, finish the file with the central directory entry, and do an atomic move to replace the old file. (It's possible to shrink a file in place, but I guess that would invalidate too many I/O buffers.)
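To make the update-then-compact idea concrete, here is a minimal C++17 sketch. It is not the proposed `npz_file` API and it does not write the ZIP format: `toy_archive`, `put`, `get`, `close`, and the record layout are all hypothetical, chosen only to show the three moving parts (append on write, an in-memory directory where the last entry for a name wins, and a temporary file renamed over the original on close).

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical toy container, NOT the ZIP format. Each record is
// [u32 name length][name][u64 payload length][payload] in host byte order.
struct entry_location {
    std::uint64_t offset;  // where the payload starts in the file
    std::uint64_t size;    // payload size in bytes
};

class toy_archive {
public:
    // Open for update: scan existing records so the directory starts current.
    explicit toy_archive(fs::path path) : path_(std::move(path)) {
        std::ifstream in(path_, std::ios::binary);
        std::uint32_t name_len = 0;
        while (in.read(reinterpret_cast<char*>(&name_len), sizeof name_len)) {
            std::string name(name_len, '\0');
            std::uint64_t size = 0;
            in.read(name.data(), name_len);
            in.read(reinterpret_cast<char*>(&size), sizeof size);
            if (!in) break;  // truncated record, stop scanning
            const auto here = static_cast<std::streamoff>(in.tellg());
            directory_[name] = {static_cast<std::uint64_t>(here), size};
            in.seekg(static_cast<std::streamoff>(size), std::ios::cur);
        }
    }

    // Append a record. A name that already exists is re-pointed to the new
    // copy; the stale bytes stay in the file until close().
    void put(const std::string& name, const std::string& payload) {
        const std::uint64_t start = fs::exists(path_) ? fs::file_size(path_) : 0;
        std::ofstream out(path_, std::ios::binary | std::ios::app);
        write_record(out, name, payload);
        const std::uint64_t header =
            sizeof(std::uint32_t) + name.size() + sizeof(std::uint64_t);
        directory_[name] = {start + header, payload.size()};
    }

    // Read the most recent payload stored under `name`.
    std::string get(const std::string& name) const {
        const entry_location loc = directory_.at(name);
        std::ifstream in(path_, std::ios::binary);
        assert(in && "archive file must be readable");
        in.seekg(static_cast<std::streamoff>(loc.offset));
        std::string payload(loc.size, '\0');
        in.read(payload.data(), static_cast<std::streamsize>(loc.size));
        return payload;
    }

    // Copy only the live entries to a sibling temporary file, then rename it
    // over the original so readers see either the old or the new file.
    void close() {
        fs::path tmp = path_;
        tmp += ".tmp";
        std::map<std::string, entry_location> fresh;
        {
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            std::uint64_t pos = 0;
            for (const auto& [name, loc] : directory_) {
                write_record(out, name, get(name));
                pos += sizeof(std::uint32_t) + name.size() + sizeof(std::uint64_t);
                fresh[name] = {pos, loc.size};
                pos += loc.size;
            }
        }  // temporary file is flushed and closed here
        // Atomic replacement on POSIX file systems; other platforms may differ.
        fs::rename(tmp, path_);
        directory_ = std::move(fresh);
    }

    std::size_t size() const { return directory_.size(); }

private:
    static void write_record(std::ostream& out, const std::string& name,
                             const std::string& payload) {
        const auto name_len = static_cast<std::uint32_t>(name.size());
        const auto size = static_cast<std::uint64_t>(payload.size());
        out.write(reinterpret_cast<const char*>(&name_len), sizeof name_len);
        out.write(name.data(), name_len);
        out.write(reinterpret_cast<const char*>(&size), sizeof size);
        out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    }

    fs::path path_;
    std::map<std::string, entry_location> directory_;  // last write per name wins
};
```

With this sketch, calling `put("x", ...)` twice leaves two copies of `x` in the file until `close()` runs, after which only the latest copy remains. A real implementation would additionally have to emit ZIP local headers and the central directory, which this toy format deliberately skips.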