unhashable type: non-nested SymInt
#1381
Comments
@bdhirsh @jerryzh168 Do you think it is a duplicate of pytorch/pytorch#136287? |
@bhack can you try with pytorch nightly as well
|
It was already on pytorch nightly and ao nightly. |
I see, in that case can you give us a minimal repro for the issue |
It is quite a large model that I cannot share. Is there any debug information I can give you? |
I see, maybe @bdhirsh can provide some pointers, looks like this is related to torch.compile |
OK, in the meantime I was looking at some examples in the SAM2 server for ao + AOTI. Since I am trying to use both in this case, what is the general best practice for using ao with AOTI? |
Some examples: https://gist.github.com/sayakpaul/de0eeeb6d08ba30a37dcf0bc9dacc5c5 (ao quant + AOTI) and https://github.com/sayakpaul/diffusers-torchao/blob/main/inference/aot_serialization.md (general AOTI) |
Thanks I see the example isn't with |
If this helps: I get this error with autoquant(model, qtensor_class_list=DEFAULT_FLOAT_AUTOQUANT_CLASS_LIST, min_sqnr=40). If I instead use
quantize_(model, int8_dynamic_activation_int8_weight())
model = unwrap_tensor_subclass(model)
I get: cannot mutate tensors with frozen storage
While executing %add_ : [num_users=1] = call_function[target=torch.ops.aten.add_.Tensor](args = (%to_2, %backbone_blocks_0_attn_qkv_bias), kwargs = {})
.....
qkv = self.qkv(x).reshape(B, H * W, 3, self.num_heads, -1).permute(2, 0, 3, 1, 4) |
For the 2nd case where we had But now we have the same error While executing %add__24 : [num_users=1] = call_function[target=torch.ops.aten.add_.Tensor](args = (%to_26, %backbone_blocks_0_mlp_fc2_bias), kwargs = {})
Original traceback:
....
File "/opt/conda/lib/python3.11/site-packages/timm/layers/mlp.py", line 46, in forward
x = self.fc2(x) |
Let me know if I can debug this more both |
I'll let Jerry comment on the |
@bdhirsh Right but for the mutation case I really cannot track the source point for the clone. I've already added the clone in many places but it seems that it is always failing with the same |
what is the current stacktrace that you get? hopefully it points to somewhere useful in the user python stack closer to where the mutation was (if not, if you have some repro code I can take a look too. Although ideally we can make the user stack situation better) |
I don't think the stack is very useful:
While executing %add_ : [num_users=1] = call_function[target=torch.ops.aten.add_.Tensor](args = (%to_2, %backbone_blocks_0_attn_qkv_bias), kwargs = {})
Original traceback:
features = self.backbone(inputs)
File "/workspace/modeling/backbone/vit.py", line 420, in forward
x = blk(x)
File "/workspace/modeling/backbone/vit.py", line 279, in forward
x = self.attn(x)
File "/workspace/modeling/backbone/vit.py", line 70, in forward
qkv = self.qkv(x).reshape(B, H * W, 3, self.num_heads, -1).permute(2, 0, 3, 1, 4) |
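One way to get a little more out of a truncated trace like the one above is to parse the "While executing" line and check whether the failing node targets an in-place op. The following is a minimal sketch in plain Python, assuming the node line has the same shape as the ones quoted in this thread. parse_fx_node is a hypothetical helper, not part of PyTorch or torchao, and the trailing-underscore check is only the usual ATen naming convention for in-place ops.

```python
import re

# Matches the node description printed after "While executing" in the traces above.
NODE_RE = re.compile(
    r"%(?P<name>\w+)\s*:\s*\[num_users=(?P<users>\d+)\]\s*=\s*"
    r"call_function\[target=(?P<target>[\w.]+)\]"
    r"\(args = \((?P<args>.*?)\), kwargs = (?P<kwargs>\{.*\})\)"
)


def parse_fx_node(line):
    """Hypothetical helper: pull the op target and argument names out of a node line."""
    match = NODE_RE.search(line)
    if match is None:
        return None
    target = match.group("target")
    parts = target.split(".")
    # For "torch.ops.aten.add_.Tensor" the op name is the piece after the namespace.
    if len(parts) > 3 and parts[:2] == ["torch", "ops"]:
        op_name = parts[3]
    else:
        op_name = parts[-1]
    args = [a.strip() for a in match.group("args").split(",") if a.strip()]
    return {
        "node": match.group("name"),
        "target": target,
        "op": op_name,
        # Heuristic (assumption): ATen in-place ops end with a single underscore.
        "in_place": op_name.endswith("_") and not op_name.startswith("__"),
        "args": args,
    }


line = (
    "While executing %add_ : [num_users=1] = "
    "call_function[target=torch.ops.aten.add_.Tensor]"
    "(args = (%to_2, %backbone_blocks_0_attn_qkv_bias), kwargs = {})"
)
info = parse_fx_node(line)
print(info["op"], info["in_place"], info["args"])
```

Here the second argument name points at the qkv bias of the first attention block, so the in-place add is applied to the output of that linear layer rather than anywhere in the user's vit.py code.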
yeah that stacktrace alone isn't very helpful, although it doesn't look complete. If you're able to provide the entire stacktrace, or links to the |
|
I didn't see an obvious source of the inplace
|
quantize_(model, int8_dynamic_activation_int8_weight())
model = unwrap_tensor_subclass(model) |
I haven't tested the serialization for autoquant yet. The last time I tried, there seemed to be some performance issues; I will probably debug that a bit later |
@jerryzh168 Do you have any hint on where we are injecting this cannot mutate tensors with frozen storage
While executing %add_ : [num_users=1] = call_function[target=torch.ops.aten.add_.Tensor](args = (%to_2, %backbone_blocks_0_attn_qkv_bias), kwargs = {})
.....
qkv = self.qkv(x).reshape(B, H * W, 3, self.num_heads, -1).permute(2, 0, 3, 1, 4)
I ask because this was exported before, and with exactly the same code the error only appears after we add quantization before the quantize_(model, int8_dynamic_activation_int8_weight())
model = unwrap_tensor_subclass(model) |
I see a similar mutate |
I was able to repro locally - looking around a bit, the mutation was coming from here. I patched torchao here, and with that patch I could run the export E2E: #1387. There are two things that are still worth looking into on the export side: (1) Ideally the error message from export should have properly pointed to that culprit code. It looks like the stack is getting lost somewhere (cc @tugsbayasgalan, @yushangdi ) (2) we should also actually fix pytorch/pytorch#127571 so no user changes are required in the first place |
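The linked patch is not reproduced in this thread. As a generic illustration only, assuming the culprit is an in-place add of a bias (which is what the aten.add_.Tensor node in the trace suggests), the sketch below contrasts an in-place add with an out-of-place one. The function names are hypothetical and this is not the actual torchao change.

```python
import torch


def add_bias_in_place(y, bias):
    # In-place: mutates y. This is the kind of call that shows up as
    # aten.add_.Tensor in a traced graph.
    y.add_(bias)
    return y


def add_bias_out_of_place(y, bias):
    # Out-of-place: allocates a new tensor and leaves y untouched,
    # so no input or intermediate storage is mutated.
    return y + bias


y = torch.randn(4, 8)
bias = torch.randn(8)
y_before = y.clone()

out = add_bias_out_of_place(y, bias)
mutated = add_bias_in_place(y.clone(), bias)
```

Both variants compute the same values; the difference is only whether the first operand's storage is written to.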
@bdhirsh thanks for confirming the repro. |
So the situation with that (1) if you are using (2) if you are using export, you currently need that API. @tugsbayasgalan is working on a more general-purpose change to |
Yes, I saw that PR. It is just that the current AO doc is ambiguous about the compile vs. export use case. I don't know if you have the time, but since you have the code now, can you also confirm the |
Just tried, looks like that doesn't work. with
with non-strict export (
|
Thanks, let me know if we could track these here or if we need new tickets. |
@bdhirsh for the autoquant error, how did you run it? You'll need to feed the model some input before export to trigger autoquant: https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#autoquantization something like this:
for |
@jerryzh168 The |
With pytorch and autoquant nightlies: