RFE: support for "output only" builds #183
Comments
At least for this one, that sounds like a bug. build_only layers shouldn't be re-built unless something in their set of {layer definition, base layer, imports} changes.
This sounds like a per-layer cache choice, which seems a little confusing to me. I'd rather just implement a
But to the meat of your proposal: it might be nice to have a /stacker-output or something where people can write stuff and it'll "automagically" show up somewhere else (probably the host rootfs, but then it would be easier to import). Or did you have something else in mind?
right. That works correctly. But when I change the
I don't think it is a per-layer cache choice. I'm not saying that I want the layer cached or not (I do), but the only content I care about is in /output. That is all that needs to be saved.
I had considered that as a separate feature request. Definitely, copying content out of the container is something that would be useful. Currently the only real way to do that is with
In this particular case that would only complicate things, though. Right now I'm telling stacker (via import
I think what I'm asking is different, though. I just want to tell stacker that its cache does not need to keep the full layer, only the specific locations. Normally stacker has to keep the entire layer, as it might be used via
It is a per-layer cache, since other layers would have their rootfses cached, but this special kind would not. Worse yet, only certain directories would be cached. I think if you're concerned about disk space, you don't want any of the layers cached, so we should implement an option for that if this is a concern. Selective caching is a recipe for bugs and confusion, IMO.
I guess I don't understand why this is important. Either you care about disk space or you don't, and if you don't, why not cache the extracted rootfs so you don't have to burn the CPU to re-extract it again in the future?
yeah, using
I just don't think it's a boolean. You might say I don't care about water usage because I have a leaky faucet, but I don't just leave the sprinkler running all day long. I have lots of these little "build stuff" layers. They're very useful. Correct me if I'm wrong, but as an example, say I have 10 "build stuff" layers and 1 "assemble stuff". Each of the "build stuff" layers build
My cache would then cost me 200G (10 build-stuff layers * 20G). That can quickly fill up my NVMe drive when all that was necessary to cache was the output of those build-stuff layers, which is 1G (10 * 100M). If you change the number of "build stuff" layers from 10 to 100, the cache would become infeasible.
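As a back-of-the-envelope check of those numbers, here is a minimal sketch. The function name cache_cost_mb is made up for illustration, sizes are in decimal megabytes (1G = 1000M), and it assumes nothing is shared between layers, which is the worst case described above.

```python
def cache_cost_mb(layer_count: int, cached_mb_per_layer: int) -> int:
    """Total cache size in MB, assuming no data is shared between layers."""
    return layer_count * cached_mb_per_layer


# Figures taken from the comment above: 20G per full layer, 100M of /output.
FULL_LAYER_MB = 20_000
OUTPUT_ONLY_MB = 100

for count in (10, 100):
    full = cache_cost_mb(count, FULL_LAYER_MB)
    output_only = cache_cost_mb(count, OUTPUT_ONLY_MB)
    print(f"{count} layers: full={full // 1000}G, output-only={output_only // 1000}G")
```

With 100 layers the same arithmetic gives 2000G for full layers against 10G for the outputs alone.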
Yes, this is true if you have 10 layers, each with a different base and different layer hashes, that total 20G each. But I doubt that's really the case; if you have one big "this is my build env" image, presumably all the "build stuff" layers use this same base, so that data will be shared. Put another way, if a hash is present in two different base layers, the data is not duplicated in stacker's cache.
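To make the sharing argument concrete, here is a toy model of a content-addressed cache in which each hash is stored once. It is only a sketch and not stacker's actual cache code; the function name, the hash strings, and the 19.9G/100M split are all hypothetical.

```python
from typing import Dict, Mapping


def deduplicated_cache_mb(layers: Mapping[str, Mapping[str, int]]) -> int:
    """Sum blob sizes in MB, counting each distinct hash only once.

    `layers` maps a layer name to its blobs, given as {hash: size_in_mb}.
    """
    unique_blobs: Dict[str, int] = {}
    for blobs in layers.values():
        unique_blobs.update(blobs)
    return sum(unique_blobs.values())


# Hypothetical setup: ten build layers share one 19.9G base blob and
# each adds 100M of its own data on top.
layers = {
    f"build-{i}": {"sha256:base": 19_900, f"sha256:build-{i}": 100}
    for i in range(10)
}
print(deduplicated_cache_mb(layers))  # shared base is counted once
```

Under these assumptions the total is 20,900M rather than 200,000M, because the base blob is counted a single time.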
I'm not sure I want to write and maintain a bunch of code that is worried about users who want to take as input 2TB of different bits. Gunzipping 2TB (assuming a 50 MB/s gunzip rate) would take 11 hours, which I'm sure people will also complain about, and want caching for.
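The 11-hour figure follows directly from the stated rate. A quick sketch of the arithmetic, using decimal units (1TB = 1,000,000MB); the function name is made up for illustration:

```python
def gunzip_seconds(total_mb: int, rate_mb_per_s: int) -> float:
    """Time in seconds to decompress total_mb at a constant rate."""
    return total_mb / rate_mb_per_s


seconds = gunzip_seconds(2_000_000, 50)  # 2TB at 50 MB/s
hours = seconds / 3600
print(f"{seconds:.0f} s is about {hours:.1f} hours")
```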
Recently in my stacker usage, I've been building a lot of "output only" layers. These containers build something and write their output to /output, and are then used by other layers in an import.
See example stacker.yaml below. The benefit of this approach is:
The notable cost of this approach is space. The cache of 'build-stubby' will contain the whole build filesystem, whereas only the contents of /output are ever needed. In this example / would be on the order of ~700M (enough for a C toolchain) but /output would be on the order of 1M or less.
One potential solution to this would be to allow the layer definition to define output-dir. The layer build process would then remove all other directories after build. It could potentially do some clever tricks with mounts or tmpfs to avoid a big/slow 'rm' of / at the end of the layer build.
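The "remove everything else after build" step could look roughly like the sketch below. It is only an illustration of the idea, not stacker code: prune_to_output is a hypothetical helper, it assumes the output directory sits directly under the rootfs, and it does the plain slow removal rather than any mount or tmpfs trick.

```python
import shutil
from pathlib import Path


def prune_to_output(rootfs: Path, output_dir: str = "output") -> None:
    """Delete every top-level entry of rootfs except the output directory."""
    keep = rootfs / output_dir
    # Snapshot the listing first so we never delete while iterating.
    for entry in list(rootfs.iterdir()):
        if entry == keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            # Regular files and symlinks (including symlinks to directories).
            entry.unlink()
```

After a call such as prune_to_output(Path("/path/to/layer/rootfs")), where the path is a placeholder, only the output directory would remain to be cached.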