Running your inference.py demo does NOT produce anything close to a good result. #36
Comments
"At a minimum" no... don't even... |
I'm getting an output, but it's just noise. Wondering if there is a VAE or something I'm missing.
It takes about 37.6GB on my A6000 to run a 768x512/24p/121-frame/40-step video. Reducing resolution helps some, but going to 384x256 only knocked that down to a bit over 31GB. That's a lot - but not unusual for modern, leading-edge models.

I tested this fork: https://github.com/KT313/LTX_Video_better_vram/tree/test - which converts the UNet to bfloat16 and dropped GPU RAM consumption to 22.2GB - so you should be able to get that running on your 4090. (I don't notice any meaningful quality difference.) If you need it to run in a lot less than that, that's another research project.

This works great for me, though - it has no trouble generating outputs that match the examples, and it took no time to add a gradio app to it. You can try running mine, with the gradio app and text encoder unloading, by pulling down this fork/branch: https://github.com/eoffermann/LTX-Video/tree/gradio - but if it's not working for you at all (or is producing really poor results), you may have other problems.
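For reference, the bfloat16 trick in that fork boils down to casting the denoiser (and optionally the VAE) weights before sampling. A minimal sketch, where `transformer` and `vae` are placeholders for whatever modules inference.py actually constructs:

```python
import torch
from torch import nn

def to_bf16(module: nn.Module) -> nn.Module:
    """Cast a model's parameters and buffers to bfloat16 (~2 bytes/param instead of 4)."""
    return module.to(dtype=torch.bfloat16)

# Usage (placeholder names; the linked fork does the equivalent inside its loader):
# transformer = to_bf16(transformer)
# vae = to_bf16(vae)
```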
I don't think you need a system spec that high to run the model anyway, especially if you're doing 768x512 generation. The DiT and VAE can be run in under 6GB of VRAM for 97 frames; the VRAM requirement is pretty much due to the T5 encoder. However, quantized T5 encoders already exist (the T5 this model uses is actually the same one used in Flux, and there is already a slew of GGUF files ready to use on Hugging Face). I'm not sure whether this particular repo will be updated to use those models, but if you'd like to try them out, ComfyUI is currently the way to go. Some nuts and bolts can be found in my other comment here: #4 (comment)
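To illustrate why the T5 encoder dominates the footprint, here's a rough, hypothetical sketch (not this repo's code; the checkpoint name is an assumption, standing in for the XXL-class encoder the pipeline actually expects) of encoding the prompt once and then freeing the encoder, so only the DiT and VAE stay resident on the GPU:

```python
import torch
from transformers import T5EncoderModel, T5TokenizerFast

# Assumption: an XXL-class T5 encoder like the one Flux uses.
ENCODER_ID = "google/t5-v1_1-xxl"

tokenizer = T5TokenizerFast.from_pretrained(ENCODER_ID)
encoder = T5EncoderModel.from_pretrained(ENCODER_ID, torch_dtype=torch.bfloat16).to("cuda")

ids = tokenizer("a detailed prompt", return_tensors="pt",
                padding="max_length", max_length=128, truncation=True).input_ids.to("cuda")
with torch.no_grad():
    prompt_embeds = encoder(ids).last_hidden_state  # reuse these for the whole sampling run

# Drop the encoder; the DiT + VAE alone reportedly fit in well under 24GB.
del encoder
torch.cuda.empty_cache()
```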
Running your inference.py demo does NOT produce anything close to a good result.
Also, there is no way inference.py runs on my 4090. It OOMs, so I need enable_model_cpu_offload() and other workarounds just to get it to run.
At a minimum, there should be a standalone Python demo that runs in 24GB and produces results like the ones being shown.
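For context, a minimal sketch of the kind of offloading I mean, assuming the diffusers LTX-Video integration rather than this repo's standalone inference.py (model ID and generation parameters here are illustrative):

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Assumption: the diffusers LTXPipeline wrapper for this model.
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # keep only the active submodule on the GPU

frames = pipe(
    prompt="a detailed prompt describing the scene and its motion",
    width=768, height=512, num_frames=121, num_inference_steps=40,
).frames[0]
export_to_video(frames, "output.mp4", fps=24)
```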