System Info
Local NVIDIA env:
(llava) xuyang@nobisuke:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
Python=3.10.4
Torch==2.0.1+cu117
Who can help?
No response
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
from optimum.bettertransformer import BetterTransformer

# `model` is assumed to already be loaded (see the filled-in sketch below)
model = BetterTransformer.transform(model)
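For context, here is a more self-contained version of that snippet with the loading step filled in. This is only a sketch: the LlavaForConditionalGeneration class and the llava-hf/llava-1.5-7b-hf checkpoint are illustrative assumptions (the report does not say how the model was loaded), and whether transform accepts a LLaVA model, or raises NotImplementedError for an unsupported architecture, is exactly what this issue is asking.

from transformers import LlavaForConditionalGeneration
from optimum.bettertransformer import BetterTransformer

# Load a LLaVA checkpoint (illustrative choice), then convert it with BetterTransformer.
model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf")
model = BetterTransformer.transform(model)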
Expected behavior
Recently, we sought to apply optimum.bettertransformer to LLaVA for fine-tuning. The code ran successfully, and we found that memory usage decreased significantly.
However, in https://huggingface.co/docs/optimum/v1.15.0/bettertransformer/overview we found that LLaVA is not in the list of supported models.
Therefore, we would like to confirm: can BetterTransformer be used for pre-training or fine-tuning LLaVA now?
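For reference (this measurement code is not part of the original report), the "memory decreased significantly" observation can be made concrete by recording peak CUDA memory for one training step before and after the transform. A minimal sketch, assuming a CUDA device and a batch `inputs` that contains labels so the model returns a loss:

import torch

def peak_cuda_memory_mb(model, inputs):
    """Run one forward/backward step and return peak allocated CUDA memory in MiB."""
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    loss = model(**inputs).loss  # assumes `inputs` includes labels
    loss.backward()
    model.zero_grad(set_to_none=True)
    return torch.cuda.max_memory_allocated() / 2**20

Calling this once on the vanilla model and once on the BetterTransformer-converted model, with the same batch, gives the before/after comparison.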
Since for decoder models we do not use nested tensors and simply rely on SDPA, I will not be adding support for more models in optimum.bettertransformer; instead, I am looking to increase SDPA coverage in Transformers.
I opened issue huggingface/transformers#28005 in Transformers to track this support. Please continue the discussion there!
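As a pointer (a sketch under stated assumptions, not something from the reply above): recent Transformers releases let you request the native SDPA attention implementation directly at load time, which is the path the reply points to once LLaVA gains SDPA coverage. The class, checkpoint, and dtype below are illustrative, and this requires a Transformers version whose LLaVA implementation supports SDPA (tracked in huggingface/transformers#28005):

import torch
from transformers import LlavaForConditionalGeneration

# Ask Transformers for its native SDPA attention instead of converting the model
# with optimum.bettertransformer (needs a version where LLaVA supports SDPA).
model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    torch_dtype=torch.float16,
    attn_implementation="sdpa",
)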