zip: find preambles for zip files written by archive/zip #24

Merged
8 changes: 8 additions & 0 deletions zip/preamble.go
@@ -45,6 +45,14 @@ func Preamble(path string) (*PreambleReader, error) {
 	// when the Reader is created, so this is the fastest (and easiest) way to find out where the zip
 	// data begins in the underlying file.
 	zipOffset := reflect.ValueOf(zr).Elem().FieldByName("baseOffset").Int()
+	// zip files written by archive/zip's Writer correctly report the byte offsets of their CDFH and
+	// EOCD entries, but this means the baseOffset calculated by the Reader will always be 0, even when
+	// non-zip data is prepended. We can detect this based on the reported byte offset of the zip
+	// header for the first file in the archive - for files that truly contain only zip data, this
+	// should also be 0. If it isn't, assume everything before the first file header is the preamble.
+	if zipOffset == 0 && len(zr.File) != 0 {
+		zipOffset = reflect.ValueOf(zr.File[0]).Elem().FieldByName("headerOffset").Int()
+	}
 	log.Debugf("%s: zip data begins at byte offset %d", path, zipOffset)
 	zr.Close()
 	f, err := os.Open(path)
8 changes: 8 additions & 0 deletions zip/preamble_test.go
@@ -30,6 +30,14 @@ func TestPreamble(t *testing.T) {
 			ZipFile: "zip_preamble.zip",
 			PreambleChecksum: "038a57f3f807fa91bdd30239b9711fccf0d782fe2f036e03211852237e94d24c", // another.zip
 		},
+		{
+			ZipFile: "arcat_preamble.zip",
+			PreambleChecksum: "46533b2dfa35ad537d3561ebee0c7af8941bc65363c1b188e1be6eaf79e9138c", // shebang_twolines.txt
+		},
Comment on lines +33 to +36

This zip file was written by arcat (i.e. by archive/zip), whereas the other test zip files here were simply catted onto another file acting as the preamble. This meant that the byte offsets of the CDFH and EOCD entries in those zip files were wrong, so the presence of the preamble was easily detectable by the previous version of Preamble. The main use case for the --preamble_from option is in fact to write the shebang from one pex file into another pex file in the python-rules plugin, so it's important that preambles can be read correctly from zip files written by arcat.
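The difference between the two kinds of test archive can be reproduced in memory. The sketch below is only an illustration, not the code in this PR: `catStyle`, `offsetAware` and `preambleLen` are hypothetical names, and using `Writer.SetOffset` to stand in for what arcat does is an assumption. Like `Preamble`, it reads the unexported `baseOffset` and `headerOffset` fields via reflection, so it assumes a Go release whose `archive/zip` has those fields.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"reflect"
)

// writeEntry adds one small file to the archive and finalises it.
func writeEntry(w *zip.Writer) error {
	f, err := w.Create("__main__.py")
	if err != nil {
		return err
	}
	if _, err := f.Write([]byte("print('hello')\n")); err != nil {
		return err
	}
	return w.Close()
}

// catStyle mimics `cat preamble archive.zip`: the archive is written on its
// own, so its recorded offsets ignore the bytes later placed in front of it.
func catStyle(preamble []byte) ([]byte, error) {
	var z bytes.Buffer
	if err := writeEntry(zip.NewWriter(&z)); err != nil {
		return nil, err
	}
	return append(append([]byte{}, preamble...), z.Bytes()...), nil
}

// offsetAware writes the preamble first and tells the Writer about it, so the
// recorded offsets are correct relative to the start of the whole file.
// (Assumption: this approximates how an archive/zip-based tool behaves.)
func offsetAware(preamble []byte) ([]byte, error) {
	var out bytes.Buffer
	out.Write(preamble)
	w := zip.NewWriter(&out)
	w.SetOffset(int64(out.Len()))
	if err := writeEntry(w); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// preambleLen follows the logic in this PR: use baseOffset if it is non-zero,
// otherwise fall back to the header offset of the first file.
func preambleLen(data []byte) (int64, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return 0, err
	}
	base := reflect.ValueOf(zr).Elem().FieldByName("baseOffset")
	if !base.IsValid() {
		return 0, errors.New("zip.Reader has no baseOffset field in this Go version")
	}
	off := base.Int()
	if off == 0 && len(zr.File) != 0 {
		hdr := reflect.ValueOf(zr.File[0]).Elem().FieldByName("headerOffset")
		if !hdr.IsValid() {
			return 0, errors.New("zip.File has no headerOffset field in this Go version")
		}
		off = hdr.Int()
	}
	return off, nil
}

func main() {
	preamble := []byte("#!/usr/bin/env python3\n")
	for name, build := range map[string]func([]byte) ([]byte, error){
		"cat-style":    catStyle,
		"offset-aware": offsetAware,
	} {
		data, err := build(preamble)
		if err != nil {
			panic(err)
		}
		n, err := preambleLen(data)
		fmt.Println(name, n, err)
	}
}
```

In the cat-style case the Reader has to compensate for the shifted offsets, which is what `baseOffset` records. In the offset-aware case nothing needs compensating, so only the first file's header offset reveals the preamble.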

+		{
+			ZipFile: "empty.zip",
+			PreambleChecksum: "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", // empty string
+		},
 	} {
 		t.Run(tc.ZipFile, func(t *testing.T) {
 			r := require.New(t)
Binary file added zip/test_data_4/arcat_preamble.zip
Binary file not shown.
Binary file added zip/test_data_4/empty.zip
Binary file not shown.
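The `empty.zip` case expects the checksum of a zero-length preamble. The diff does not show which hash the test uses; assuming it is SHA-256 (the expected values are 64 hex digits, and the `empty.zip` value is the well-known SHA-256 of empty input), it can be checked with a small sketch. `checksum` is a hypothetical helper, not a function from this repository.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checksum returns the hex-encoded SHA-256 digest of the preamble bytes.
// (Assumption: the test compares against this kind of digest.)
func checksum(preamble []byte) string {
	sum := sha256.Sum256(preamble)
	return hex.EncodeToString(sum[:])
}

func main() {
	// An archive with no preamble yields zero bytes to hash.
	fmt.Println(checksum(nil))
}
```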