When the model is on the CPU, the device of the tensor returned by encode is cuda #2694
Hello! This is a bug with Sentence Transformers < 2.3 that caused this behavior. You should be able to fix it by updating your sentence-transformers installation, e.g. with `pip install -U sentence-transformers`.
I upgraded to 3.0.0, and the returned tensor is on the CPU now. But the model still uses the GPU even if I move it to the CPU (2.2.2 is fine):
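A minimal sketch of the kind of code that shows this, assuming a GPU is available (the model name `all-MiniLM-L6-v2` is an example, not taken from the original report):

```python
import torch
from sentence_transformers import SentenceTransformer

# Without a device argument, the model lands on the GPU by default.
model = SentenceTransformer("all-MiniLM-L6-v2")
model.to("cpu")

embedding = model.encode("some text", convert_to_tensor=True)
print(embedding.device)              # cpu (correct after the upgrade)
print(torch.cuda.memory_reserved())  # still non-zero: GPU memory not freed
```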
Indeed, the GPU memory is used for 2 reasons:

1. The model is first loaded on the GPU (the default device when one is available) and only then moved to the CPU.
2. Initializing CUDA at all reserves some GPU memory that PyTorch keeps around.

These each come with a recommendation; you can use one or the other:

1. Load the model directly on the CPU by passing `device="cpu"` to `SentenceTransformer`.
2. Hide the GPU from PyTorch, e.g. by setting the `CUDA_VISIBLE_DEVICES` environment variable to an empty string before `torch` is imported.

If you do recommendation 2, then you don't have to do recommendation 1 anymore, as "cpu" will then be the automatically used device. You can then just load the model without specifying a device, as in the sketch below.
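A minimal sketch of both recommendations (the model name is an example):

```python
import os

# Recommendation 2: hide the GPU before torch is imported, so "cpu"
# becomes the automatically selected device.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

from sentence_transformers import SentenceTransformer

# Recommendation 1 would instead be:
# model = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")

# With the GPU hidden, no device argument is needed:
model = SentenceTransformer("all-MiniLM-L6-v2")
print(model.device)  # cpu
```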
I understand that the model needs to be loaded onto the GPU first and then moved to the CPU, but why isn't the GPU memory released after moving to the CPU?
The model doesn't need to be loaded on the GPU at all; you can load it on the CPU straight away with `SentenceTransformer(model_name, device="cpu")`. And the memory is released; only the CUDA context itself keeps holding a small amount of GPU memory, as sketched below.
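A sketch of how to check this, assuming a CUDA-capable machine:

```python
import torch
from sentence_transformers import SentenceTransformer

# Loaded directly on the CPU: no tensor ever touches the GPU.
model = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")

if torch.cuda.is_available():
    # Nothing was allocated by PyTorch on the GPU.
    print(torch.cuda.memory_allocated())  # 0
# Note: the small CUDA context that nvidia-smi attributes to a process
# is separate from PyTorch's allocator statistics reported above.
```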
Let's look at the code directly, to make sure the model is on the CPU:
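A minimal sketch of such a reproduction (the model name and input string are examples; the original snippet may differ):

```python
import sentence_transformers
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
model.to("cpu")
print(model.device)  # cpu -- the model really is on the CPU

embedding = model.encode("hello world", convert_to_tensor=True)
print(embedding.device)  # cuda:0 on sentence-transformers 2.2.2

print("sentence-transformers:", sentence_transformers.__version__)
```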
Output:
sentence-transformers: 2.2.2