Replies: 6 comments 16 replies
-
Mixing upscaled latents into a larger latent image to maintain composition information will artificially reduce the amount of gaussian noise in the image faster than the noise schedule accounts for. This is most likely going to get you blurry images. It's also not very different from your typical "highres" workflow, where the desired composition is given by re-noising an upscaled image to add the finer details, except that there the amount of noise stays in accordance with the noise schedule.
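A toy check of the noise-level point (not from any actual workflow): bilinear upscaling averages neighbouring values, and averaging independent gaussian noise shrinks its standard deviation, so a naive latent mix ends up less noisy than the schedule expects.

```python
import torch
import torch.nn.functional as F

noise_hi = torch.randn(1, 4, 64, 64)   # "full-res" gaussian noise, std ~ 1.0
noise_lo = torch.randn(1, 4, 32, 32)   # "low-res" gaussian noise, std ~ 1.0
noise_up = F.interpolate(noise_lo, scale_factor=2, mode="bilinear", align_corners=False)

mixed = 0.5 * noise_hi + 0.5 * noise_up  # naive 50/50 latent mix

print(noise_hi.std().item())  # ~1.00
print(noise_up.std().item())  # < 1.00: upscaling already averaged the noise
print(mixed.std().item())     # lower still, so the sampler "sees" less noise
                              # than the schedule assumes at that step
```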
-
The blur explanation sounds plausible, though not exhaustive. First, couldn't the schedule be modified to account for the mixing? Second, if using ancestral samplers / those that add noise, I think it shouldn't be too big of a problem.

The typical hires fix I find extremely underwhelming, since you have to "balance" (more like trade off, with both sides being pretty bad) between too little denoise, where you don't really get an appreciable enhancement of detail, and too much, where you get distortion/artifacts almost as if you had started with a high-res image. I have tried some odd flows to get better high-res results, and I think they manage somewhat better results (though it's easy to become biased here, and I'm not sure how objectively measurable it is). I'm also not looking at how similar the upscaled image is to the original; I'm more interested in a good high-res image that decently follows the prompt, with the low-res image serving as a "guiding" intermediate step.

So one flow is: upscale the latent, then unsample it (I tried lots of settings, but euler, cfg 1, and a decent number of steps [for now I settled on 40] seems to give the best results later on), then go from the unsampler to a KSampler (advanced), don't add noise, use an "appropriate" cfg (not sure about that, it might depend on the resulting image size and other things), a different sampler, almost certainly ancestral, and a different number of steps, ~20 (in general I like ancestral samplers much more than non-ancestral ones, as they need far fewer steps and maybe just give better results overall). And that's it; it's possible to get quite high-res results that way, though still not great, as it still suffers from a lack of correlation across the image.

But all of that still suffers because there is no "integration over scale", and I think such "scale averaging sampling" could solve it, again with appropriate weighting of the factors: "large scale" denoising being important at the start, and then almost not at all near the end of sampling. The unsample-resample flow is what makes me think this should work much better than the regular or the above-mentioned flows, since it is clear that the seed/initial noise has an extremely strong impact on the result. When you do a regular hires pass, the noise introduced is unrelated to anything done previously, so it clashes with whatever "preimage" the original noise had. Whereas if you have just one initial noise and operate on it across multiple scales via averaging, there should be much less of this "clash of preimages" producing weird artifacts. Of course it would need tuning and figuring out good drop-offs for the factors and so on. If the sampler code were a bit less cryptic I could probably attempt it myself :). Well, I might be interested enough to try to figure it out as is, though it seemed very complex.
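A minimal sketch of the kind of drop-off curve described here: the coarse (averaged) prediction dominates early and fades to almost nothing by the end of sampling. The curve shape and the `power` parameter are arbitrary placeholders that would need tuning, not anything tested.

```python
def coarse_weight(step: int, total_steps: int, power: float = 2.0) -> float:
    """Weight of the coarse-scale prediction at a given step (1.0 at the start, 0.0 at the end)."""
    t = step / max(total_steps - 1, 1)  # normalized progress through sampling
    return (1.0 - t) ** power           # higher power -> coarse influence fades sooner

# Example: 20-step run, coarse weight every 4 steps
for s in range(0, 20, 4):
    print(s, round(coarse_weight(s, 20), 3))
```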
-
Hm, yeah, scale invariance can end up being tricky, as it might have weird unintended consequences. The unsampler I only use with the same prompt; I just change the sampler name/cfg and steps, and generally the results seem quite robust. I think I actually might sort of create a workflow replicating this using your nodes. Maybe... One thing I do not understand: it seems that "image" and "noise" are stored as separate things in the sampler? If so, that possibly complicates things. Still, with slerp and inject noise I could probably have a go at cobbling something together, by doing single steps with KSampler (advanced) and then fiddling with inject/slerp/upscale and so on. Though when I tried slerping for hires stuff, I think it mostly did not go well. Thanks for taking the time to answer my weird ideas!
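For reference, "slerp" here means spherical linear interpolation between two noise/latent tensors. A generic minimal version looks roughly like the following; the existing inject/slerp nodes may implement it differently.

```python
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float) -> torch.Tensor:
    """Spherically interpolate between two tensors of the same shape, treated as flat vectors."""
    a_flat, b_flat = a.reshape(-1), b.reshape(-1)
    a_unit = a_flat / a_flat.norm()
    b_unit = b_flat / b_flat.norm()
    omega = torch.acos((a_unit * b_unit).sum().clamp(-1.0, 1.0))  # angle between the two vectors
    if omega.abs() < 1e-6:
        return (1.0 - t) * a + t * b  # nearly parallel: fall back to plain lerp
    so = torch.sin(omega)
    out = (torch.sin((1.0 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape)

# Example: blend two noise tensors 30% of the way from a to b
noise_a = torch.randn(1, 4, 64, 64)
noise_b = torch.randn(1, 4, 64, 64)
blended = slerp(noise_a, noise_b, 0.3)
```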
-
So, in case anyone wants an update: I think I have reworked the original implementation into a 2x-res sampler that works decently, and although I'm no longer as certain that it is significantly better than regular hires, I'll probably create a custom node repo sometime this week. Some things are not explored with it yet, e.g. using it instead of the regular sampler in a hires fix workflow; I tried that just a little, and it can probably be made to work with certain params. Not sure yet, and overall the params need to be tweaked based on prompt/model, which I guess is not out of the ordinary / to be expected.
-
Hey guys, so I finally published the repo with what I was experimenting on: https://github.com/morphles/xSamplers The showcase is a bit lacking (since I'm rather tired from testing all kinds of stuff, but hopefully it will be interesting enough to at least someone), and I'll probably revisit the mixing curves a bit. It would be nice if you could check whether it works at all for you, and maybe help with the progress bar, as it currently does not work. I intend to post it on reddit later today as well.
-
Well, it seems interest here has waned :). Frankly, a bit for me too, though I still see decent possibilities in this method; even bilinear scaling works, and works well with different curve params. So there are more avenues to explore, but if no one shows interest even after the reddit post, I'll likely not pursue it much further.
-
This is probably way out there, and maybe just because I don't know enough, but still... (I also tried looking at the code, but Python is not my first language, and I have particular problems with imperative code, which here is also highly complex and unfamiliar, so I did not get far for now [if ever].) So I need someone smarter/more knowledgeable to at least comment on whether I'm talking complete nonsense or whether this could be feasible somehow.
So, as far as I understand how SD roughly works: it gets noise, tries to predict "counter noise" that would move the noise towards the picture requested in the prompt, doesn't get it right on the first step, so it applies what it guessed and tries again. Then there is that scheduling business, which as far as I was able to find is basically how much noise it tries to eliminate at a given step of the process. Whatever. I want high-res pictures, and SD 1.5, the one with the huge pile of community models, is not great at that, and the usual hires fix flow I find underwhelming (I've probably spent way too much time over the past couple of weeks trying to build something better in ComfyUI, not really succeeding much, but maybe in some cases getting somewhat better pictures than usual).

In any case, thinking about the hires fix, how samplers might work, and noise, and playing with BlenderNeko's unsampler etc. got me wondering: couldn't a sampler be written so that it "predicts across multiple scales"? (I know the latent is not strictly pixels, but this is the easiest way to explain it.) Suppose we have 8x8 noise pixels and we need to predict how to move them from noise towards a picture. The model predicts (I'm somewhat guessing) "counter noise" for each pixel; fine, we have an 8x8 counter-noise grid. But now we also average our noise in 2x2 cells to get a 4x4 noise grid and run the same prediction on it. Then we need to combine this averaged 4x4 prediction with the full-scale 8x8 one, and we can just average them (for each pixel of the 8x8 predicted noise, average its value with the one from the 4x4 predicted noise, obviously using the same 4x4 noise value for each group of four 8x8 pixels). Obviously we can't do this all the way through, since in the end we want fine details. So instead of straight averaging, in the early steps the coarse/averaged predicted noise should weigh more, and as we approach the end (or maybe even the middle, or after just a few steps), it should mostly use the fine/usual prediction.
Not the cleanest explanation, but hopefully it can be understood. My idea is that one of the biggest problems with high res is the detachment of different areas, but if we can somehow "introduce correlations" via this averaged noise/prediction, it should work "like a hires fix" just in one go; actually I would think it should work much, much better. Also it doesn't need to be just one level of averaging (which would limit it to roughly 2x the usual rendering size [per dimension]). Of course this will introduce a significant slowdown, but again, it might be worth it if it works.
So can someone who knows the technicalities better than me explain whether this is complete nonsense that can't work for some reason, or whether there's a chance it could somehow work?
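A rough sketch of what a single step of this multi-scale idea could look like, assuming a hypothetical `predict_noise(latent, sigma)` callable standing in for the model. This is not how ComfyUI's samplers are actually structured, and the linear weight ramp is just a placeholder.

```python
import torch
import torch.nn.functional as F

def multiscale_prediction(latent, sigma, step, total_steps, predict_noise):
    # Full-resolution "counter noise" prediction (the 8x8 grid in the example).
    fine = predict_noise(latent, sigma)

    # Average the latent in 2x2 cells (the 4x4 grid), predict on it, then
    # replicate each coarse value back to its 2x2 block so it can be mixed per pixel.
    coarse_latent = F.avg_pool2d(latent, kernel_size=2)
    coarse = predict_noise(coarse_latent, sigma)
    coarse_up = F.interpolate(coarse, scale_factor=2, mode="nearest")

    # Coarse prediction weighs more at the start and fades out towards the end;
    # the linear ramp is an arbitrary choice that would need tuning.
    t = step / max(total_steps - 1, 1)
    w = 1.0 - t
    return w * coarse_up + (1.0 - w) * fine
```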