Generating ONNX file for the inference of Mamba? #200
We have no experience with ONNX. Do you have ideas on how to generate ONNX for custom operations? If so, would you like to contribute?
Thanks @tridao! I am working in that direction (here is a guide on how to do it: https://github.com/onnx/tutorials/blob/master/PyTorchCustomOperator/README.md). Could you let me know what the other custom operators are, if any, besides scan? The code for scan is here (if I am not mistaken): https://github.com/state-spaces/mamba/blob/main/csrc/selective_scan/selective_scan.cpp. Any chance you have a standalone implementation of scan?
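For what it's worth, the approach in that tutorial amounts to registering a symbolic function that maps the custom TorchScript op to a named ONNX node, which an ONNX Runtime custom-op library then has to implement. A minimal sketch, assuming a hypothetical op registered as `mamba::selective_scan` (the real op name and signature in this repo may differ):

```python
import torch
import torch.onnx

def selective_scan_symbolic(g, u, delta, A, B, C, D):
    # Emit one opaque ONNX node; a runtime kernel must be registered
    # under the same custom domain/name for inference to work.
    return g.op("custom_domain::SelectiveScan", u, delta, A, B, C, D)

# Hypothetical "namespace::op" name of the TorchScript custom op.
torch.onnx.register_custom_op_symbolic(
    "mamba::selective_scan", selective_scan_symbolic, opset_version=17
)
```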
@llmexperiment You may be interested in looking at the HF transformers implementation (PR here), which supports a fallback if the custom CUDA kernels are not available.
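For context, that fallback pattern is roughly: try to import the fused CUDA kernel, and route through a pure-PyTorch reference path when it is missing. A sketch (the import path matches this repo; treat the dispatch logic as an assumption, not the HF code):

```python
try:
    from mamba_ssm.ops.selective_scan_interface import selective_scan_fn
except ImportError:
    selective_scan_fn = None  # fused CUDA kernel not installed

def scan(u, delta, A, B, C, D):
    # Use the fused kernel when available on GPU; otherwise fall back to a
    # pure-torch reference (which is also the path an exporter can trace).
    if selective_scan_fn is not None and u.is_cuda:
        return selective_scan_fn(u, delta, A, B, C, D)
    return selective_scan_torch(u, delta, A, B, C, D)  # reference, sketched below
```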
Hey, I'm interested in converting the Vision Mamba (Vim) paper to ONNX but have not had success. I decided to start by working with Mamba layers first and then proceed from there. This is the current status of my code.
From what I understand, torch.arange creates dynamic arguments which …
Replace the optimized Triton operations with the original torch operations.
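To make that concrete, here is a minimal sequential selective scan in plain PyTorch, in the spirit of `selective_scan_ref` in this repo (the shapes and argument names here are my assumptions, and it will be slow; its value is that it contains only standard traceable ops):

```python
import torch

def selective_scan_torch(u, delta, A, B, C, D):
    # u, delta: (batch, dim, length); A: (dim, state)
    # B, C: (batch, state, length); D: (dim,)
    batch, dim, length = u.shape
    h = u.new_zeros(batch, dim, A.shape[1])  # hidden state
    ys = []
    for t in range(length):
        dA = torch.exp(delta[:, :, t, None] * A)                  # discretized A
        dBu = delta[:, :, t, None] * B[:, None, :, t] * u[:, :, t, None]
        h = dA * h + dBu                                          # h_t = dA * h_{t-1} + dB * u_t
        ys.append(torch.einsum("bdn,bn->bd", h, C[:, :, t]))      # y_t = C_t . h_t
    return torch.stack(ys, dim=-1) + D[None, :, None] * u         # plus skip connection
```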
Could you share more details on your implementation?
I noticed that when using the original torch operations, …
If you are interested, we also have a few threads at:
- pytorch/pytorch#130150 (selective_scan custom ops)
- pytorch/pytorch#95408 (comment) (native selective_scan and associative_scan)
- pytorch/pytorch#120189 (mamba native)
Can you show me the code? Thanks!
Dear @tridao, @albertfgu,
It looks like it is not straightforward to generate an ONNX file using torch.onnx.export, because of the custom operations involved; based on my understanding, these prevent the export from succeeding. It would be great to have an ONNX file for the inference part of the smallest model.
Any suggestions on how we can generate an ONNX file for inference (and, separately, for training)?
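For reference, once every custom op has either a traceable pure-torch path or a registered symbolic, the export attempt itself would be the standard call below. This is only a hedged sketch: the model class and `from_pretrained` match this repo, but the shapes, opset, and vocab size are assumptions, and the export is not verified to succeed:

```python
import torch
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-130m").eval()
dummy_ids = torch.randint(0, 50277, (1, 64))  # (batch, seq_len); vocab size assumed

torch.onnx.export(
    model,
    (dummy_ids,),
    "mamba-130m.onnx",
    input_names=["input_ids"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq_len"}},
    opset_version=17,
)
```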