Support for explicit padding and dilations in 2D CNN layers #138
Conversation
Explicit padding support
Moving dilations to padding fork
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
+ Coverage   91.67%   92.35%   +0.67%
==========================================
  Files          29       29
  Lines        1730     1752      +22
  Branches      327      330       +3
==========================================
+ Hits         1586     1618      +32
+ Misses         74       70       -4
+ Partials       70       64       -6
```

View full report in Codecov by Sentry.
This looks good to me; see the comments below. It would also be nice to test more of the errors, but I know that is pretty tedious.
```python
raise ValueError(
    f"{node.name} input has {len(node.input)} dimensions, only nodes with 2 or 3 input dimensions can be used as starting points for consumption."
)
```
The phrasing `consumption` is not very intuitive. If I were a user and fed my CNN into OMLT and got this error, it would not be clear to me what to do to fix it.
I was cribbing from the docstrings, but after a bit more time with the code I've reworded all of these errors slightly. The node type checks should never be hit unless there's a bug in the calling function, and `._visit_node()` doesn't have that problem. The dimension checks are hopefully a bit clearer.
Agree. I think we can be more specific, e.g.: to define a convolution layer, the input/output channels and weight matrix are required; bias is optional.
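For illustration, a more specific message might look like this (a sketch only; the `weights` variable and the exact wording are assumptions, not the PR's actual code):

```python
# Illustrative only: surface the missing pieces to the user directly.
if weights is None:
    raise ValueError(
        f"Cannot build a convolution layer from node {node.name}: input/output "
        "channels and a weight matrix are required (bias is optional)."
    )
```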
if "dilations" in attr: | ||
dilations = attr["dilations"] | ||
else: | ||
dilations = (len(input_output_size) - 1) * [1] |
This line needs test coverage.
Coverage added for the non-dilation case.
I think this PR already adds tests with dilations, and the old tests already cover cases without dilations. We can check the coverage after the CI issue is fixed.
src/omlt/neuralnet/layer.py (outdated diff):

```python
        ),
    ]
)
self.__dilated_kernel = dilate_kernel
```
This line needs test coverage.
Test added
Agree. This part is important.
```python
@property
def dilated_kernel(self):
    """Return the dilated cross-correlation kernel."""
    return self.__dilated_kernel
```
This function needs to be tested
Test added
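For reference, a test along these lines would exercise the property (a rough sketch; the `ConvLayer2D` constructor arguments, in particular the `dilations` keyword this PR introduces, are assumed):

```python
import numpy as np
from omlt.neuralnet.layer import ConvLayer2D

def test_dilated_kernel():
    # A 2x2 kernel of ones with dilations [2, 2] should become a 3x3 kernel
    # with zeros interleaved between the original weights.
    kernel = np.ones((1, 1, 2, 2))
    layer = ConvLayer2D([1, 4, 4], [1, 2, 2], [1, 1], kernel, dilations=[2, 2])
    assert layer.dilated_kernel.shape == (1, 1, 3, 3)
    assert layer.dilated_kernel[0, 0, 0, 1] == 0.0
```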
```python
if lb is None:
    raise ValueError("Expression is unbounded below.")
if ub is None:
    raise ValueError("Expression is unbounded above.")
```
What expression? More context should be provided to users.
I don't currently have the knowledge to give a better explanation - can we make a note that these still need work and come back to them later?
I think the auxiliary variables introduced in the partition formulation will always be bounded, and I wonder if we really need these error messages.
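For context, these checks presumably guard the result of Pyomo's FBBT bounds helper (an assumption about the surrounding code); if they are kept, the message could name the offending expression, as the earlier comment asks:

```python
from pyomo.contrib.fbbt.fbbt import compute_bounds_on_expr

lb, ub = compute_bounds_on_expr(expr)
if lb is None:
    # Including the expression gives the user something actionable.
    raise ValueError(f"Expression {expr} is unbounded below.")
if ub is None:
    raise ValueError(f"Expression {expr} is unbounded above.")
```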
We should also update the CNN Jupyter notebook (https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/mnist_example_convolutional.ipynb) so that it incorporates the new work.
```python
    activation=activation,
    input_index_mapper=input_index_mapper,
)
self.__kernel = kernel
if self.dilations != [1, 1]:
```
If I understand correctly, when dilations != [1, 1], this part should extend the kernel from shape [out_channel, in_channel, r, c] to shape [out_channel, in_channel, dr*(r-1)+1, dc*(c-1)+1] by inserting zeros, where [dr, dc] are the dilations. But this part of the code does not do the extension correctly. For example, assume the kernel has size [2, 1, 4, 3] and dilations = [2, 3]; then dilate_kernel should have size [2, 1, 7, 7], but we get an error here due to mismatched sizes. My suggestion is to assign values directly to dilate_kernel instead of using numpy.hstack and numpy.dstack.
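A direct-assignment version of that extension could look roughly like this (a standalone sketch under the shapes described above; in the PR the logic would live in ConvLayer2D's constructor):

```python
import numpy as np

def dilate(kernel, dilations):
    """Insert zeros between kernel weights along the two spatial axes."""
    out_c, in_c, r, c = kernel.shape
    dr, dc = dilations
    dilated = np.zeros((out_c, in_c, dr * (r - 1) + 1, dc * (c - 1) + 1))
    # Strided assignment places each original weight at its dilated position;
    # the gaps remain zero.
    dilated[:, :, ::dr, ::dc] = kernel
    return dilated

# For kernel shape [2, 1, 4, 3] and dilations [2, 3] this yields [2, 1, 7, 7].
```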
```diff
@@ -467,17 +478,11 @@ def _consume_pool_nodes(self, node, next_nodes):
     kernel_depth = attr["kernel_shape"][0]
     kernel_shape = attr["kernel_shape"][1:]
```
The kernel size for maxpool2d does not include the input/output channels. There are only two options for the kernel size: (i) one integer k, in which case the kernel size is [1, 1, k, k]; or (ii) two integers [r, c], in which case the kernel size is [1, 1, r, c]. We need to change lines 478-479 to get the correct kernel_depth (which should be 1) and kernel_shape (which should be [k, k] or [r, c]). After fixing that, the check in lines 490-492 makes sense; otherwise, we get an error message like "Kernel shape [4] has 1 dimensions. Strides attribute has 2 dimensions. These must be equal."
How do I read the correct kernel_depth and kernel_shape from the node attribute? Do I need to count the dimensions to determine whether it's been given as k, [r,c], or [1,1,r,c]? Or do I just have the indices wrong here?
We do not need to determine which case applies, since ONNX already standardizes the kernel dimensions. For maxpooling2d the node attribute gives [r, c] (or [k, k] for case (i)), so we just need to define kernel_shape as attr["kernel_shape"]. Since the output channels equal the input channels, defining kernel_depth as in_channels will be fine.
(This code predates this PR.)
I've now set kernel_depth to in_channels.
In the maxpool_2d.onnx file used in the test_onnx_parser/test_maxpool test, attr["kernel_shape"] for node1 is (3, 2, 3). I'm not sure what each of these dimensions represents, but if I take all 3 it fails.
```diff
@@ -464,20 +475,14 @@ def _consume_pool_nodes(self, node, next_nodes):
     in_channels = input_output_size[0]

     attr = _collect_attributes(node)
-    kernel_depth = attr["kernel_shape"][0]
+    kernel_depth = in_channels
     kernel_shape = attr["kernel_shape"][1:]
```
I think kernel_shape should be attr["kernel_shape"], since the first dimension is no longer the depth in max pooling. Can you add some tests on this part to see which one is correct?
Some options for this. We could:
If there isn't an obvious answer, let's discuss.
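Putting the two suggestions together, the pooling parser would presumably end up something like this (a sketch of the suggested resolution, not the final code):

```python
in_channels = input_output_size[0]

attr = _collect_attributes(node)
kernel_depth = in_channels            # max pooling preserves the channel count
kernel_shape = attr["kernel_shape"]   # ONNX MaxPool gives [r, c] (or [k, k])
```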
This PR addresses issue "More features for CNN in OMLT" (#129) by adding support for padding and dilations in 2D layers of CNNs. I've made changes to the Layer2D class and its subclasses PoolingLayer2D and ConvLayer2D, as well as to the network parsing functions that build them.
This is passing tests that demonstrate it hasn't broken anything for layers without padding or dilations, but I haven't had a good example to validate that it's producing the correct results for layers with these attributes.
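One way to get such a validation example would be to export a small PyTorch model that uses these attributes and compare its output against the parsed network (a sketch; the model and the use of load_onnx_neural_network here are illustrative):

```python
import tempfile

import onnx
import torch
import torch.nn as nn

from omlt.io import load_onnx_neural_network

model = nn.Sequential(nn.Conv2d(1, 2, kernel_size=3, padding=1, dilation=2), nn.ReLU())
x = torch.randn(1, 1, 8, 8)

with tempfile.NamedTemporaryFile(suffix=".onnx") as f:
    torch.onnx.export(model, x, f.name)
    net = load_onnx_neural_network(onnx.load(f.name))
# The parsed layers' kernels/padding can then be checked against model(x).
```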
Not implemented:
Legal Acknowledgement
By contributing to this software project, I agree my contributions are submitted under the BSD license.
I represent I am authorized to make the contributions and grant the license.
If my employer has rights to intellectual property that includes these contributions,
I represent that I have received permission to make contributions and grant the required license on behalf of that employer.