-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Batch Normalization Layer modules #157
base: main
Are you sure you want to change the base?
Conversation
Thank you @Spnetic-5, great. Yes, this is certainly in the correct direction. I haven't yet reviewed the forward and backward algorithms, let me know when it's ready for review. Also, despite #156, we should be able to test this layer implementation on its own, without integrating it with the network. In other words, while #156 will be necessary for full integration, we can work on this implementation as a standalone layer before #156. |
@milancurcic Please review the forward and backward pass implementations I've added based on my interpretation of the paper, also could you guide me on how we can test this layer? |
Thanks! Let's review it briefly on the call today. We can test this layer independently by passing some small, known input, and comparing the result with the corresponding known output. This should be straightforward since the batchnorm operation is relatively simple (just normalization of data). The backward is the inverse operation, so as I understand it, we can pass the same expected output to recover the same expected input. |
@Spnetic-5 I just saw your message on Discourse, no problem; we'll proceed work on this PR as usual. |
See for example a program that tests forward and backward passes of the maxpool2d layer using known inputs and expected outputs: https://github.com/modern-fortran/neural-fortran/blob/main/test/test_maxpool2d_layer.f90 We'd use the same approach to test a batchnorm layer. |
Hello, @milancurcic. Sorry for the lack of activity over the past few days; this was my final week of internship in Canada, and I'll be returning to India on Monday. I added a test module for the batch norm layer, however it has some error; I believe I will need your assistance on this. |
No worries at all, thanks for all the work. I'll review it tomorrow. |
allocate(res % input(num_features, num_features)) | ||
allocate(res % output(num_features, num_features)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the shape for these should be (num_features, num_features)
, but rather (batch_size, num_features)
. The batch_size
also won't be known until the first forward pass, so we should defer the allocation until then. In the forward pass, we could have a simple allocated
check to see if they have not been allocated then, and allocate them to the shape of the input
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant (num_features, batch_size)
.
normalized_input => (input - reshape(self % running_mean, shape(input, 1))) & | ||
/ sqrt(reshape(self % running_var, shape(input, 1)) + self % epsilon) & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
running_mean
and running_var
are not yet updated anywhere, only initialized.
sample_input = 1.0 | ||
gradient = 2.0 | ||
|
||
!TODO run forward and backward passes directly on the batchnorm_layer instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a simple test directly on the batchnorm_layer
instance rather than the high-level layer_type
instance so you get the idea how it will work.
Addresses #155
@milancurcic , I've included the structure of the batch normalization layer. Could you please review it and confirm whether I'm in the correct direction?