I am very impressed with how you enforce constraints with Lagrange multipliers.
In the paper, I notice that affine layers are encoded as z(i) = W(i) z(i-1) + b(i), which captures the behavior of fully-connected, convolutional, and similar layers.
But an Add layer in an ONNX residual network computes something like z(i) = z(i-1) + z(i-k), taking two earlier activations as input rather than one. I fail to see how your encoding extends to this case, yet I did observe residual networks in your experiments.
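For concreteness, here is a minimal PyTorch sketch of the kind of residual block I mean (the layer sizes are hypothetical, just for illustration):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block illustrating the Add layer
    z(i) = z(i-1) + z(i-k), which combines two earlier activations."""

    def __init__(self, dim: int = 16):  # dim is a hypothetical size
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)  # affine: z = W z + b
        self.fc2 = nn.Linear(dim, dim)  # affine: z = W z + b
        self.act = nn.ReLU()

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        skip = z                       # z(i-k): saved for the skip connection
        z = self.act(self.fc1(z))
        z = self.fc2(z)
        return self.act(z + skip)      # Add layer: depends on two inputs,
                                       # not of the single-input form W z + b
```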
So I wonder: is there a theorem behind handling residual networks? And if so, is it just an adaptation of your existing result?
Thank you in advance for your clarification!