-reproducible should imply -mkfs-time $constant #140

Open
scgtrp opened this issue Dec 25, 2021 · 6 comments


scgtrp commented Dec 25, 2021

If I create the same image twice, with -reproducible, without changing the contents of the source directory:

```
$ mksquashfs rootfs/ a -all-root -noappend -comp xz -reproducible
$ mksquashfs rootfs/ b -all-root -noappend -comp xz -reproducible
```

I would expect to get two identical output files. However, they are not:

```
$ sha1sum a b
7204f4037cd840de1bf5db259abd8b0170518308  a
4e9caf6d8cb632284ea521c51d8bbcb866d7e8b8  b
```

The only difference between the two is the timestamp in the filesystem superblock, which is set to the current time even in -reproducible mode:

```
$ diff -u <(hexdump -C a) <(hexdump -C b)
--- /dev/fd/63	2021-12-25 04:46:55.789247153 -0500
+++ /dev/fd/62	2021-12-25 04:46:55.789247153 -0500
@@ -1,4 +1,4 @@
-00000000  68 73 71 73 97 01 00 00  5c e7 c6 61 00 00 02 00  |hsqs....\..a....|
+00000000  68 73 71 73 97 01 00 00  60 e7 c6 61 00 00 02 00  |hsqs....`..a....|
 00000010  04 00 00 00 04 00 11 00  c0 00 01 00 04 00 00 00  |................|
 00000020  bd 1c fa 05 00 00 00 00  38 8d 36 00 00 00 00 00  |........8.6.....|
 00000030  30 8d 36 00 00 00 00 00  ff ff ff ff ff ff ff ff  |0.6.............|
```

It seems that -reproducible should, in addition to all the other things it keeps constant, replace the timestamp with a constant value (0? 0xFFFFFFFF?).
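
For anyone who wants to check this on their own images, here is a minimal Python sketch that decodes the field in question. It assumes the usual SquashFS 4.0 superblock layout (4-byte magic, 32-bit inode count, then a 32-bit little-endian creation time at byte offset 8); the function name `read_mkfs_time` is made up for this example and is not part of squashfs-tools.

```python
import struct

SQUASHFS_MAGIC = b"hsqs"  # magic bytes as they appear in the hexdump


def read_mkfs_time(superblock: bytes) -> int:
    """Return the creation time (seconds since the epoch) from a superblock."""
    # Assumed layout: magic (4 bytes), inode count (u32), mkfs time (u32),
    # all little-endian.
    magic, _inodes, mkfs_time = struct.unpack_from("<4sII", superblock, 0)
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian SquashFS image")
    return mkfs_time


# First 16 bytes of images a and b, copied from the diff output.
HEADER_A = bytes.fromhex("68737173970100005ce7c66100000200")
HEADER_B = bytes.fromhex("687371739701000060e7c66100000200")

# Difference in seconds between the two embedded creation times.
print(read_mkfs_time(HEADER_B) - read_mkfs_time(HEADER_A))
```

The two headers differ only in that one byte (0x5c versus 0x60), so the embedded times are 4 seconds apart, which matches running the two commands back to back.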

(Also, happy whatever winter holiday(s) you celebrate! I feel weird submitting this today because of the "Merry Christmas! I got you a bug report!" thing but if I don't do it right now it'll slip my mind and never get done.)


roeey777 commented Dec 25, 2021

I completely agree with @scgtrp. In addition, I would suggest that the -reproducible flag imply use of the SOURCE_DATE_EPOCH environment variable, as described by reproducible-builds.org.


scgtrp commented Dec 25, 2021

I was unaware of that, but having read it, I think I agree. You could accomplish the same thing with -mkfs-time $SOURCE_DATE_EPOCH, but if it's clear the user meant that, might as well just do it.

I'd propose something like "use the first available of -mkfs-time, $SOURCE_DATE_EPOCH (maybe only if reproducible?), current time (only if !reproducible), 0". That seems like the least surprising behavior.
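
The precedence proposed above could be sketched roughly as follows. This is an illustration in Python, not mksquashfs code: `resolve_mkfs_time` and all of its parameters are hypothetical names, and the open "maybe only if reproducible?" question is left as a switch rather than decided.

```python
import os
import time
from typing import Callable, Mapping, Optional


def resolve_mkfs_time(
    mkfs_time_opt: Optional[int],
    reproducible: bool,
    env: Mapping[str, str] = os.environ,
    now: Callable[[], float] = time.time,
    epoch_needs_reproducible: bool = True,  # hypothetical switch
) -> int:
    """Pick the superblock timestamp using the first available source."""
    # 1. An explicit -mkfs-time value always wins.
    if mkfs_time_opt is not None:
        return mkfs_time_opt
    # 2. SOURCE_DATE_EPOCH, optionally honoured only in reproducible mode.
    epoch = env.get("SOURCE_DATE_EPOCH")
    if epoch and (reproducible or not epoch_needs_reproducible):
        return int(epoch)
    # 3. The current time, but only when reproducibility was not requested.
    if not reproducible:
        return int(now())
    # 4. Reproducible mode with nothing else to go on: a constant.
    return 0


print(resolve_mkfs_time(None, True, env={"SOURCE_DATE_EPOCH": "456"}))
```

With `-reproducible` set and nothing else supplied, this falls through to 0, so the output never depends on the wall clock.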

plougher self-assigned this Feb 22, 2022
plougher added this to the Undecided milestone Feb 22, 2022

deven commented Feb 23, 2024

How about defaulting -mkfs-time to the newest timestamp of any inode in the created filesystem when using -reproducible? That would be inherently stable if the source content is unchanged.
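
A rough sketch of how such a default could be computed, for illustration only. It walks the source tree rather than the finished image, on the assumption that the image's inode times are taken from the source files; `newest_mtime` is a made-up helper, not something squashfs-tools provides.

```python
import os


def newest_mtime(root: str) -> int:
    """Return the newest mtime (whole seconds) found under root, root included."""
    # lstat is used so that a symlink contributes its own mtime,
    # not the mtime of whatever it points at.
    newest = int(os.lstat(root).st_mtime)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            st = os.lstat(os.path.join(dirpath, name))
            newest = max(newest, int(st.st_mtime))
    return newest
```

The result could then be passed as the `-mkfs-time` value. It only changes when something in the source tree is touched, which is the stability property described above.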


nanonyme commented Nov 6, 2024

> How about defaulting -mkfs-time to the newest timestamp of any inode in the created filesystem when using -reproducible? That would be inherently stable if the source content is unchanged.

That has heavy implications: it relies on everyone's build systems preserving mtimes and on no one packaging directly from git. People who have been wary of using SOURCE_DATE_EPOCH directly have historically added their own project-specific environment variables that distributors can opt in to. That's pretty simple to do.


deven commented Nov 7, 2024

> That has heavy implications: it relies on everyone's build systems preserving mtimes and on no one packaging directly from git. People who have been wary of using SOURCE_DATE_EPOCH directly have historically added their own project-specific environment variables that distributors can opt in to. That's pretty simple to do.

This has nothing to do with git or build systems. Creating a reproducible SquashFS filesystem requires stable source file mtimes in the first place, so there is no opportunity to confuse a build system. Using the newest timestamp from the source files would be a more intelligent default for -mkfs-time when -reproducible is used, since using the current timestamp inherently breaks the intended reproducibility.


nanonyme commented Nov 9, 2024

> This has nothing to do with git or build systems. Creating a reproducible SquashFS filesystem requires stable source file mtimes in the first place, so there is no opportunity to confuse a build system. Using the newest timestamp from the source files would be a more intelligent default for -mkfs-time when -reproducible is used, since using the current timestamp inherently breaks the intended reproducibility.

I mean, many people use squashfs tooling inside a build system to create redistributable app images. In fact, the entire snap format works like this. I did, however, realize from your comment that I was thinking inside too small a box. The -mkfs-time parameter does sound like a flexible enough mechanism, and your default is reasonable as well.


5 participants