
Support for explicit padding and dilations in 2D CNN layers #138

Open · jezsadler wants to merge 18 commits into main

Conversation

jezsadler (Collaborator) commented:

To address issue #129 (More features for CNN in OMLT), this adds padding and dilations for 2D layers in CNNs. I've made changes to the Layer2D class and its subclasses PoolingLayer2D and ConvLayer2D, as well as to the network parsing functions that build them.

This passes tests demonstrating that it hasn't broken anything for layers without padding or dilations, but I haven't had a good example to validate that it produces correct results for layers that do have these attributes.

Not implemented:

  • use of autopad
  • biases

Legal Acknowledgement
By contributing to this software project, I agree my contributions are submitted under the BSD license.
I represent I am authorized to make the contributions and grant the license.
If my employer has rights to intellectual property that includes these contributions,
I represent that I have received permission to make contributions and grant the required license on behalf of that employer.


codecov bot commented Nov 17, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (a3d128d) 91.67% compared to head (97ba41b) 92.35%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
+ Coverage   91.67%   92.35%   +0.67%     
==========================================
  Files          29       29              
  Lines        1730     1752      +22     
  Branches      327      330       +3     
==========================================
+ Hits         1586     1618      +32     
+ Misses         74       70       -4     
+ Partials       70       64       -6     


rmisener requested a review from pulsipher · December 4, 2023 15:34
pulsipher (Collaborator) left a comment:

This looks good to me; see the comments below. It would also be nice to test more of the errors, but I know that's pretty tedious.

raise ValueError(
f"{node.name} input has {len(node.input)} dimensions, only nodes with 2 or 3 input dimensions can be used as starting points for consumption."
)

Collaborator:

The phraseology "consumption" is not very intuitive. If I were a user and fed my CNN into OMLT and got this error, it would not be clear to me what to do to fix it.

Collaborator Author:

I was cribbing from the docstrings, but after a bit more time with the code I've reworded all of these errors slightly. The node type checks should never be hit unless there's a bug in the calling function, and `_visit_node()` doesn't have that problem. The dimension checks are hopefully a bit clearer.

Collaborator:

Agree. I think we can be more specific, like: to define a convolution layer, the input/output channels and weight matrix are required; bias is optional.

if "dilations" in attr:
dilations = attr["dilations"]
else:
dilations = (len(input_output_size) - 1) * [1]
Collaborator:

This line needs test coverage.

Collaborator Author:

Coverage added to the non-dilation case

Collaborator:

I think this PR already adds tests with dilations, old tests already cover cases without dilations. We can check the coverage after the CI issue is fixed.
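
For reference, the default branch in the snippet quoted above can be exercised in isolation. This is a minimal standalone sketch of that logic under the same variable names; it is not OMLT's parser code:

def default_dilations(attr, input_output_size):
    # Mirrors the quoted diff: fall back to all-ones dilations, one per
    # spatial dimension of the [channels, rows, cols] input.
    if "dilations" in attr:
        return attr["dilations"]
    return (len(input_output_size) - 1) * [1]

assert default_dilations({}, [1, 28, 28]) == [1, 1]
assert default_dilations({"dilations": [2, 3]}, [1, 28, 28]) == [2, 3]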

),
]
)
self.__dilated_kernel = dilate_kernel
Collaborator:

This line needs test coverage.

Collaborator Author:

Test added

Collaborator:

Agree. This part is important.

Comment on lines +608 to +611
@property
def dilated_kernel(self):
"""Return the dilated cross-correlation kernel"""
return self.__dilated_kernel
Collaborator:

This function needs to be tested

Collaborator Author:

Test added

if lb is None:
raise ValueError("Expression is unbounded below.")
if ub is None:
raise ValueError("Expression is unbounded above.")
Collaborator:

What expression? More context should be provided to users.

Collaborator Author:

I don't currently have the knowledge to give a better explanation - can we make a note that these still need work and come back to them later?

Collaborator:

I think the auxiliary variables introduced in partition formulation will always be bounded, and wonder if we really need these error messages.
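
If the checks are kept, a more contextual wording along the lines requested above might look like this (an illustrative sketch only; the surrounding bounds-tightening context is assumed, not taken from OMLT's source):

# Hypothetical rewording; "partition formulation" follows the discussion above.
if lb is None:
    raise ValueError(
        "Cannot bound the auxiliary variable in the partition formulation: "
        "the layer's affine expression is unbounded below. Check that the "
        "network inputs have finite lower bounds."
    )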

rmisener self-requested a review · December 14, 2023 16:25
rmisener (Member) left a comment:

We should also update the CNN Jupyter notebook (https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/mnist_example_convolutional.ipynb) so that it incorporates the new work.

activation=activation,
input_index_mapper=input_index_mapper,
)
self.__kernel = kernel
if self.dilations != [1, 1]:
Collaborator:

If I understand correctly, when dilations != [1, 1], this part extends the kernel from shape [out_channel, in_channel, r, c] to shape [out_channel, in_channel, dr*(r-1)+1, dc*(c-1)+1] by inserting zeros, where [dr, dc] are the dilations. But this part of the code does not do the extension correctly. For example, if the kernel has size [2, 1, 4, 3] and dilations = [2, 3], then the size of dilate_kernel should be [2, 1, 7, 7], but we get an error here due to mismatched sizes. My suggestion is to assign values directly into dilate_kernel instead of using numpy.hstack and numpy.dstack.
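
A minimal numpy sketch of the direct-assignment approach suggested here (the helper name is illustrative, not OMLT's API):

import numpy as np

def dilate(kernel, dilations):
    # Insert zeros so a [out_ch, in_ch, r, c] kernel gets spatial shape
    # [dr*(r-1)+1, dc*(c-1)+1]; strided assignment places each weight.
    out_ch, in_ch, r, c = kernel.shape
    dr, dc = dilations
    dilated = np.zeros((out_ch, in_ch, dr * (r - 1) + 1, dc * (c - 1) + 1))
    dilated[:, :, ::dr, ::dc] = kernel
    return dilated

# The example from this comment: a [2, 1, 4, 3] kernel with dilations [2, 3]
# dilates to [2, 1, 7, 7].
assert dilate(np.ones((2, 1, 4, 3)), (2, 3)).shape == (2, 1, 7, 7)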

@@ -467,17 +478,11 @@ def _consume_pool_nodes(self, node, next_nodes):
kernel_depth = attr["kernel_shape"][0]
kernel_shape = attr["kernel_shape"][1:]
Collaborator:

The kernel size for maxpool2d does not include the input/output channels. There are only two options for the kernel size: (i) one integer k, giving a kernel of size [1, 1, k, k]; or (ii) two integers [r, c], giving [1, 1, r, c]. We need to change lines 478-479 to get the correct kernel_depth (which should be 1) and kernel_shape (which should be [k, k] or [r, c]). After fixing that, the check in 490-492 makes sense. Otherwise, we get an error message like "Kernel shape [4] has 1 dimensions. Strides attribute has 2 dimensions. These must be equal."

Collaborator Author:

How do I read the correct kernel_depth and kernel_shape from the node attribute? Do I need to count the dimensions to determine whether it's been given as k, [r,c], or [1,1,r,c]? Or do I just have the indices wrong here?

Collaborator:

We do not need to determine which case, since ONNX already standardizes the kernel dimensions. For maxpooling2d, the node attribute gives [r, c] (or [k, k] for case (i)), so we just need to define kernel_shape as attr["kernel_shape"]. Since the output channels equal the input channels, defining kernel_depth as in_channels will be fine.
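
A sketch of the parsing described here, reusing the names from the quoted diff (the helper _collect_attributes is taken from the snippet, not verified against OMLT's current source):

# ONNX MaxPool's kernel_shape attribute holds only the spatial dims,
# so no slicing is needed; pooling preserves the channel count.
attr = _collect_attributes(node)
kernel_shape = attr["kernel_shape"]  # e.g. (3, 3) or (r, c)
kernel_depth = in_channels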

Collaborator Author:

(This code predates this PR.)

I've now set kernel_depth to in_channels.

In the maxpool_2d.onnx file used in the test_onnx_parser/test_maxpool test, attr["kernel_shape"] for node1 is (3, 2, 3). I'm not sure what each of these dimensions represents, but if I take all 3 it fails.

@@ -464,20 +475,14 @@ def _consume_pool_nodes(self, node, next_nodes):
     in_channels = input_output_size[0]

     attr = _collect_attributes(node)
-    kernel_depth = attr["kernel_shape"][0]
+    kernel_depth = in_channels
     kernel_shape = attr["kernel_shape"][1:]
Collaborator:

I think kernel_shape = attr["kernel_shape"], since the first dimension is no longer the depth in max pooling. Can you add some tests for this part to see which one is correct?

jezsadler (Collaborator Author):

> We should also update the CNN Jupyter notebook (https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/mnist_example_convolutional.ipynb) so that it incorporates the new work.

Some options for this. We could:

  1. add dilations and padding to the existing layers in the CNNs
  2. add additional layers with dilations and padding to the existing CNNs
  3. add a new CNN with layers with dilations and padding

If there isn't an obvious answer, let's discuss. (Option 1 is sketched below.)
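
As a hedged illustration of option 1, assuming the notebook's PyTorch model (the layer values here are hypothetical, not the notebook's actual architecture):

import torch.nn as nn

# One existing conv layer, now exercising the PR's explicit padding
# and dilation support when the trained model is exported to ONNX.
conv = nn.Conv2d(
    in_channels=1,
    out_channels=4,
    kernel_size=3,
    stride=2,
    padding=1,
    dilation=2,
)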
