zip: find preambles for zip files written by archive/zip #24

chrisnovakovic
Contributor

Zip files written by `archive/zip`'s `Writer` record the correct byte offsets for their central directory file headers and end of central directory record. As a result, the offset of the zip data within the file, as calculated by `archive/zip`'s `Reader`, is always 0. In other words, `Reader` thinks that any non-zip data prepended to the file is part of the first file's local file header. This prevents any zip file written by `archive/zip` from being used as a source for the `zip` command's `--preamble_from` option, because the non-zip data doesn't appear to be non-zip data at all.

In cases where `Reader` finds that the byte offset of the zip data within the file is 0, check whether the byte offset of the first local file header is also 0. If it isn't, assume that the byte offset of the first local file header is the true starting position of the zip data within the file, and that anything before it is a preamble.

@chrisnovakovic chrisnovakovic force-pushed the zip-preamble_from-archive-zip-fix branch from f6bbcb3 to 07b2baa Compare October 24, 2024 22:08
@chrisnovakovic chrisnovakovic force-pushed the zip-preamble_from-archive-zip-fix branch from 07b2baa to 9d97356 Compare October 24, 2024 22:09
Comment on lines +33 to +36
	{
		ZipFile:          "arcat_preamble.zip",
		PreambleChecksum: "46533b2dfa35ad537d3561ebee0c7af8941bc65363c1b188e1be6eaf79e9138c", // shebang_twolines.txt
	},

This zip file was written by `arcat` (i.e. by `archive/zip`), whereas the other test zip files here were simply appended with `cat` to another file acting as the preamble. The byte offsets in the CDFH and EOCD entries of those zip files were therefore wrong (relative to the start of the zip data rather than the start of the file), so the presence of the preamble was easily detectable by the previous version of `Preamble`. The main use case for the `--preamble_from` option is in fact to write the shebang from one pex file into another pex file in the python-rules plugin, so it's important that preambles can be read correctly from zip files written by `arcat`.

@chrisnovakovic chrisnovakovic merged commit 5462d2a into please-build:master Oct 25, 2024
2 checks passed
@chrisnovakovic chrisnovakovic deleted the zip-preamble_from-archive-zip-fix branch October 25, 2024 09:55