
Memory Leak when running encoding in ThreadPool #1854

Open
JoanFM opened this issue Mar 3, 2023 · 4 comments

JoanFM commented Mar 3, 2023

I have seen a curious behavior when running the encoding of a sentence-transformers model inside a ThreadPool.

Look at this code, which runs with no problem and constant memory consumption:

from sentence_transformers import SentenceTransformer

if __name__ == '__main__':
    model = SentenceTransformer('msmarco-distilbert-base-v3', device='cpu')


    def f():
        texts = ['testsdkjsdlfajsdlslfk jofjiwo wofj owifjwo ijwoifj ofj o3jpovpopor3j'] * 30
        embeddings = model.encode(texts)
        print(f' embeddings shape {embeddings.shape}')


    while True:
        f()

On the other hand, this code blows up the system memory and rapidly leads to an OOM:

from sentence_transformers import SentenceTransformer
import asyncio

if __name__ == '__main__':
    model = SentenceTransformer('msmarco-distilbert-base-v3', device='cpu')


    def f():
        texts = ['testsdkjsdlfajsdlslfk jofjiwo wofj owifjwo ijwoifj ofj o3jpovpopor3j'] * 30
        embeddings = model.encode(texts)
        print(f' embeddings shape {embeddings.shape}')


    while True:
        loop = asyncio.new_event_loop()
        loop.run_in_executor(None, f)

The only difference is that the encoding happens inside a Thread.
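
For reference, a rough way to watch the memory while encoding from a worker thread is to use concurrent.futures directly; this is only a sketch (psutil and the fixed iteration count are my additions, and it may not reproduce the exact asyncio pattern above):

from concurrent.futures import ThreadPoolExecutor

import psutil
from sentence_transformers import SentenceTransformer

if __name__ == '__main__':
    model = SentenceTransformer('msmarco-distilbert-base-v3', device='cpu')
    process = psutil.Process()

    def f():
        texts = ['testsdkjsdlfajsdlslfk jofjiwo wofj owifjwo ijwoifj ofj o3jpovpopor3j'] * 30
        return model.encode(texts).shape

    with ThreadPoolExecutor(max_workers=1) as executor:
        for i in range(1000):
            shape = executor.submit(f).result()
            # RSS should stay roughly flat; in the leaking case it keeps growing
            rss_mb = process.memory_info().rss / 1e6
            print(f'iteration {i}: embeddings shape {shape}, rss {rss_mb:.1f} MB')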


JoanFM commented Mar 3, 2023

Seems potentially related to:

pytorch/pytorch#64412

@chschroeder

In #1795 a sample similar to your non-threaded variant also seems to fail. Not sure if this is related, but there have been multiple issues regarding memory leaks lately.


chschroeder commented Mar 13, 2023

Update: I just had a few minutes and tried the above script with everything inside the main loop of encode() deleted except for tokenize():

# only tokenization left in the loop; the model forward pass and everything after it is removed
for start_index in trange(0, len(sentences), batch_size, desc="Batches", disable=not show_progress_bar):
    sentences_batch = sentences_sorted[start_index:start_index + batch_size]
    features = self.tokenize(sentences_batch)
    del features
return []

The leak still shows even then. Once I delete tokenize() as well (and return constant output), the leak is gone. So there is at least a problem in tokenize().
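
To check this outside of encode(), one option is to call tokenize() directly in a loop and watch the process memory; a minimal sketch (psutil and the iteration counts are arbitrary choices on my side, model loaded as in the scripts above):

import psutil
from sentence_transformers import SentenceTransformer

if __name__ == '__main__':
    model = SentenceTransformer('msmarco-distilbert-base-v3', device='cpu')  # or the model under test
    process = psutil.Process()
    texts = ['testsdkjsdlfajsdlslfk jofjiwo wofj owifjwo ijwoifj ofj o3jpovpopor3j'] * 30

    for i in range(10000):
        features = model.tokenize(texts)
        del features
        if i % 100 == 0:
            # if tokenize() leaks, this number keeps climbing
            print(f'iteration {i}: rss {process.memory_info().rss / 1e6:.1f} MB')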

(Off topic: Why is this extra Transformer class needed?)

This led me to a very old issue: huggingface/transformers#197. This is my progress so far; I will stop now, but maybe these notes will help.

Edit: similar issue here

Edit 2:
I forgot to add that I changed the model to paraphrase-multilingual-MiniLM-L12-v2. With the above script:

  • sentence-transformers/msmarco-distilbert-base-v3: no leak
  • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2: leak
  • sentence-transformers/all-MiniLM-L12-v2: no leak
  • sentence-transformers/paraphrase-mpnet-base-v2: no leak

It seems there are multiple problems, but the tokenizer is only an issue for sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2. I stumbled upon this one because I had problems with this model in another context.
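
For completeness, a rough sketch for comparing the models above under the same tokenize-only loop (psutil, gc, and the iteration count are my choices; RSS is only a coarse signal):

import gc

import psutil
from sentence_transformers import SentenceTransformer

MODELS = [
    'sentence-transformers/msmarco-distilbert-base-v3',
    'sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2',
    'sentence-transformers/all-MiniLM-L12-v2',
    'sentence-transformers/paraphrase-mpnet-base-v2',
]

if __name__ == '__main__':
    process = psutil.Process()
    texts = ['testsdkjsdlfajsdlslfk jofjiwo wofj owifjwo ijwoifj ofj o3jpovpopor3j'] * 30

    for name in MODELS:
        model = SentenceTransformer(name, device='cpu')
        gc.collect()
        before = process.memory_info().rss
        for _ in range(2000):
            model.tokenize(texts)
        gc.collect()
        after = process.memory_info().rss
        # a large positive delta points at the tokenizer of this model
        print(f'{name}: rss delta {(after - before) / 1e6:.1f} MB')
        del model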


AnghelRA commented Apr 5, 2024

Found the same issue when encoding an image with CLIP: when running in a thread the memory keeps increasing, but when running in a plain loop it remains constant.
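
For reference, a minimal image variant of the scripts above; the clip-ViT-B-32 checkpoint and the blank PIL image are my assumptions, not the exact setup used here:

from concurrent.futures import ThreadPoolExecutor

from PIL import Image
from sentence_transformers import SentenceTransformer

if __name__ == '__main__':
    # sentence-transformers CLIP checkpoints can encode PIL images directly
    model = SentenceTransformer('clip-ViT-B-32', device='cpu')
    image = Image.new('RGB', (224, 224))

    def f():
        return model.encode(image).shape

    with ThreadPoolExecutor(max_workers=1) as executor:
        while True:
            # memory reportedly grows in the threaded case but stays flat when calling f() directly
            print(f'embedding shape {executor.submit(f).result()}')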
