
[Question]: Cannot re-initialize CUDA in forked subprocess when loading models in Gunicorn #3342

Closed
kkarski opened this issue Oct 17, 2023 · 2 comments
Labels
question Further information is requested

Comments


kkarski commented Oct 17, 2023

Question

I am trying to operationalize a few Flair models behind a Flask/Gunicorn REST API.

I am consistently hitting the following error when loading a second model instance in Gunicorn:

Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

This happens whether I try to invoke workers sharing a global SequenceTagger or if I try to instantiate two SequenceTaggers for different models in the same worker.

For example, the following code will throw the above error inside Gunicorn:

from flair.models import SequenceTagger

posModel = SequenceTagger.load('flair/pos-english-fast')
nerModel = SequenceTagger.load('flair/ner-english-ontonotes-fast')

I suspect this is not a Flair issue but a PyTorch one; however, it still poses a problem in my case.

I have tried the first of the following resolution suggestions without success (the second seems very involved):

https://stackoverflow.com/questions/61120314/cannot-launch-gunicorn-flask-app-with-torch-model-on-the-docker
https://stackoverflow.com/questions/72779926/gunicorn-cuda-cannot-re-initialize-cuda-in-forked-subprocess

Is there a different suggested method of running shared models in production to parallelize processing?
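For context on what the error message is asking for: the 'spawn' start method launches each child as a fresh interpreter instead of fork-copying the parent process (and any live CUDA context it holds). A minimal standard-library sketch of the switch the message refers to, with no CUDA or Flair involved:

```python
import multiprocessing as mp

def make_pool(n_workers):
    # Request the 'spawn' start method explicitly instead of the Linux
    # default 'fork'. Spawned children start as fresh interpreters, so
    # they inherit no CUDA context from the parent -- this is the
    # change the PyTorch error message asks for.
    ctx = mp.get_context("spawn")
    return ctx.Pool(n_workers)

if __name__ == "__main__":
    with make_pool(2) as pool:
        # Builtins like len are picklable by name, so they survive the
        # trip to a spawned child process.
        print(pool.map(len, ["pos-tagging", "ner"]))  # -> [11, 3]
```

Gunicorn itself always forks its workers, so this only illustrates the mechanism; the practical question is making sure CUDA is first initialized inside a worker rather than in the master.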

@kkarski kkarski added the question Further information is requested label Oct 17, 2023

kkarski commented Oct 17, 2023

BTW, running the same code on the Flask dev server works fine; things only break when deployed to Gunicorn.

helpmefindaname (Collaborator) commented
Hi @kkarski
as you noticed, this is an issue with torch & Gunicorn, so I'd suggest raising the problem with those projects instead.
It doesn't make sense for Flair to make recommendations on that topic, hence I am closing this issue.
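One direction that is sometimes suggested for this class of error (a hypothetical sketch under assumptions, not a Flair-endorsed recipe; `post_fork` is a real Gunicorn server hook, but the loading strategy here is illustrative) is to keep all model loading out of the Gunicorn master so that CUDA is first initialized inside each forked worker:

```python
# gunicorn.conf.py -- hypothetical sketch: defer model loading (and
# therefore CUDA initialization) until after each worker has forked.

taggers = {}

def post_fork(server, worker):
    # Runs inside the freshly forked worker process. Importing and
    # loading here means each worker creates its own CUDA context
    # rather than inheriting one from the master -- the situation the
    # "Cannot re-initialize CUDA in forked subprocess" check rejects.
    from flair.models import SequenceTagger
    taggers['pos'] = SequenceTagger.load('flair/pos-english-fast')
    taggers['ner'] = SequenceTagger.load('flair/ner-english-ontonotes-fast')
```

The trade-off is that every worker holds its own copy of the models, so GPU memory use scales with the worker count; the same effect can be had by lazy-loading the models on first request inside the app instead of at import time.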
