os-atlas-pro-7b crashes #2875
Comments
Hi @louisabraham, thank you for opening this issue. I believe it is similar to #2879 and may be related to a bug with CUDA graphs and rotary embeddings. I'm looking for a good fix now and will update this issue once it is resolved. In the meantime, I believe it's possible to avoid the problem by setting the environment variable CUDA_GRAPHS=0.
Thanks! Indeed.
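For anyone launching the server from a script rather than through the endpoint UI, the workaround above can be sketched in Python. This is only an illustration: the variable name CUDA_GRAPHS and the value 0 come from the comment above, while the helper names and the echo command are hypothetical placeholders. On a managed Inference Endpoint the variable would presumably be set in the endpoint's environment settings instead.

```python
import os
import subprocess
import sys


def env_without_cuda_graphs(base_env=None):
    """Return a copy of the environment with CUDA_GRAPHS set to "0"."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_GRAPHS"] = "0"  # value suggested in the maintainer's comment
    return env


def run_with_cuda_graphs_disabled(command):
    """Run `command` (a list of strings) in a child process with CUDA_GRAPHS=0."""
    return subprocess.run(
        command,
        env=env_without_cuda_graphs(),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Placeholder command: it only echoes the variable back to show that the
    # child process sees it. Replace it with your actual server launch command.
    echo = [sys.executable, "-c", "import os; print(os.environ['CUDA_GRAPHS'])"]
    print(run_with_cuda_graphs_disabled(echo).stdout.strip())
```

The helper copies the environment instead of mutating `os.environ`, so the setting only affects the child process.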
Hi, I have another issue when trying to increase the context length past 32768.
It's weird because it works for smaller values such as 30000.
System Info
I'm using an Inference Endpoint with this code:
https://huggingface.co/OS-Copilot/OS-Atlas-Pro-7B/tree/6c0135de0627db98533ac4b47ae71fa17cf21c48
Information
Tasks
Reproduction
Spawn an Inference Endpoint with https://huggingface.co/OS-Copilot/OS-Atlas-Pro-7B/tree/6c0135de0627db98533ac4b47ae71fa17cf21c48
Expected behavior
I get this error