
Running your inference.py demo does NOT produce anything close to a good result. #36

Closed
aifartist opened this issue Nov 27, 2024 · 4 comments
@aifartist

Running your inference.py demo does NOT produce anything close to a good result.
Also, inference.py simply will not run as-is on my 4090: it OOMs, so I have to call enable_model_cpu_offload() and make other changes just to get it to run at all, roughly as sketched below.

At a minimum there should be a standalone Python demo that runs in 24 GB of VRAM and produces results like the ones being shown.
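For reference, the workaround amounts to something like the following. This is a minimal sketch assuming the demo's pipeline can be loaded through a diffusers-style interface; the checkpoint id, pipeline class, and call parameters here are assumptions, not necessarily what inference.py actually uses.

```python
import torch
from diffusers import DiffusionPipeline

# Assumption: the model loads through the generic DiffusionPipeline interface;
# substitute whatever pipeline inference.py actually constructs.
pipe = DiffusionPipeline.from_pretrained(
    "Lightricks/LTX-Video",        # assumed checkpoint id
    torch_dtype=torch.bfloat16,    # half-precision weights to reduce VRAM
)

# Move each sub-model to the GPU only while it is needed, then back to CPU.
# Slower per step, but keeps peak usage within a 24 GB card.
pipe.enable_model_cpu_offload()

# Parameter names below are illustrative; match them to the pipeline's signature.
result = pipe(
    prompt="a wave crashing against a rocky shoreline at sunset",
    width=768,
    height=512,
    num_frames=121,
    num_inference_steps=40,
)
```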

@jpgallegoar

"At a minimum" no... don't even...

@gjnave

gjnave commented Nov 30, 2024

I'm getting an output, but it's just noise. Wondering if there is a VAE or something I'm missing.

@eoffermann

eoffermann commented Dec 5, 2024

It takes about 37.6 GB on my A6000 to run a 768x512, 24 fps, 121-frame, 40-step video. Reducing resolution helps some, but going to 384x256 only knocked that down to a bit over 31 GB. That's a lot, but not unusual for running modern, leading-edge models.

I tested https://github.com/KT313/LTX_Video_better_vram/tree/test, which converts the UNet model to bfloat16; that dropped GPU RAM consumption to 22.2 GB, so you should be able to get it running on your 4090. (I don't notice any meaningful quality difference.)
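The core of that change is essentially a dtype cast after the weights are loaded. A sketch only; the attribute name (`transformer` vs. `unet`) depends on how the pipeline in inference.py is built:

```python
import torch
from torch import nn

def cast_to_bf16(model: nn.Module) -> nn.Module:
    """Cast a loaded denoiser (UNet / DiT) to bfloat16 to roughly halve its VRAM use."""
    return model.to(dtype=torch.bfloat16)

# Usage (names are placeholders for whatever object inference.py constructs):
# pipe.transformer = cast_to_bf16(pipe.transformer)
```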

If you need it to run in a lot less, that's another research project.

This works great for me, though. I haven't had any issues getting it to generate results that match the prompt descriptions, like the published examples. It took no time to add a Gradio app to it.

You can try running mine, with the Gradio app and text-encoder unloading, by pulling down this fork/branch: https://github.com/eoffermann/LTX-Video/tree/gradio. But if it's not working for you at all (or is producing really poor results), you may have other problems.
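If you'd rather bolt a UI onto your own setup, the Gradio side is only a few lines. A minimal sketch, assuming you already have some `run_inference(prompt, steps)` wrapper around the repo's pipeline that returns the path to the rendered video; that helper name is made up here:

```python
import gradio as gr

def run_inference(prompt: str, steps: int) -> str:
    """Placeholder: call the repo's inference code here and return the output .mp4 path."""
    raise NotImplementedError

demo = gr.Interface(
    fn=lambda prompt, steps: run_inference(prompt, int(steps)),
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(10, 50, value=40, step=1, label="Inference steps"),
    ],
    outputs=gr.Video(label="Generated video"),
    title="LTX-Video",
)
demo.launch()
```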

@able2608

able2608 commented Dec 5, 2024

I don't think you need specs that high to run the model anyway, especially for 768x512 generation. The DiT and VAE can run in under 6 GB of VRAM for 97 frames; the VRAM requirement is mostly due to the T5 text encoder. However, quantized T5 encoders already exist (the T5 this model uses is the same one used by Flux, and there is already a slew of GGUF quantizations ready to use on Hugging Face). I'm not sure whether this particular repo will be updated to use those models, but if you would like to try them out, ComfyUI is the current way to go. Some nuts and bolts can be found in my other comment here: #4 (comment)
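Outside ComfyUI, one alternative to GGUF for shrinking the T5 footprint in plain Python is 8-bit loading via bitsandbytes. A sketch under stated assumptions: the checkpoint id and `text_encoder` subfolder below are illustrative, and the pipeline attribute name may differ.

```python
from transformers import T5EncoderModel, BitsAndBytesConfig

# Illustrative: load the T5 encoder in 8-bit to cut its VRAM use.
# Point the repo id / subfolder at whichever T5-XXL checkpoint your pipeline expects.
text_encoder = T5EncoderModel.from_pretrained(
    "Lightricks/LTX-Video",        # assumed checkpoint id
    subfolder="text_encoder",      # assumed layout
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# Then hand this encoder to the pipeline in place of the full-precision one,
# e.g. pipe.text_encoder = text_encoder (attribute name depends on the pipeline).
```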
