How can I observe the change of gamma in the attention block? #1

Open
w617156977 opened this issue Sep 16, 2019 · 8 comments

Comments

@w617156977

Thanks for sharing your code!
I want to observe how the gamma coefficient changes in the network, but I ran into some problems with Keras. Can you give me some advice?

@kiyohiro8
Owner

Thanks for your comment.
Do you mean that you want to read the gamma coefficient at specific points in the training process?

@w617156977
Author

@kiyohiro8 Yes, I want to draw a curve of the gamma coefficient in a specific layer during the training process.

@w617156977
Author

Another problem: my code easily runs out of memory, and it still did not work when I tried reducing the channel number to 1/8.

@kiyohiro8
Owner

> drawing a curve of the gamma coefficient

OK, I'll try to draw a curve of the gamma coefficient. Please wait a few days.

> out of memory

The Self-Attention layer easily runs out of memory because the number of entries in its attention map equals the square of the number of positions in the input feature map. So it is necessary to reduce the input image size or use a GPU with a large amount of memory. I used a GTX 1080 Ti for training with this code.
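
For a rough sense of scale, here is a back-of-the-envelope sketch; the 64×64 feature map and float32 dtype are illustrative assumptions, not values taken from this repository:

```python
# Rough estimate of the attention map size in a self-attention layer.
# The 64x64 feature map and float32 dtype below are illustrative assumptions.
h, w = 64, 64                 # spatial size of the feature map entering the layer
n = h * w                     # number of spatial positions
attention_entries = n * n     # the attention map is (h*w) x (h*w)
bytes_per_entry = 4           # float32

mib = attention_entries * bytes_per_entry / 1024 ** 2
print(f"{n}x{n} attention map ~ {mib:.0f} MiB per image")
# 64x64 input -> 4096x4096 map ~ 64 MiB per image (before the batch dimension).
# The cost grows with the 4th power of the spatial size: doubling h and w
# multiplies it by 16, while halving the channels barely changes it.
```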

@w617156977
Author

Thanks for your reply.
In `SelfAttentionLayer.py`, `self.filters_f_g = self.channels // 8` seems to reduce the number of parameters, but whether or not I keep this line, my code still runs out of memory. I would like to know if my understanding is correct.

@kiyohiro8
Owner

Yes. The memory consumption of the Self-Attention layer is affected far more by the height and width of the feature map than by the number of channels. I think `self.filters_f_g = self.channels // 8` is more about aggregating information than about reducing memory.
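
To make the distinction concrete, here is a minimal sketch of the shape bookkeeping inside a SAGAN-style self-attention block; it is written with tf.keras purely for illustration, and the function and variable names are assumptions, not this repository's code:

```python
import tensorflow as tf

def self_attention_sketch(x, channels):
    """Sketch of the tensor shapes in a SAGAN-style self-attention block."""
    h, w = x.shape[1], x.shape[2]
    n = h * w  # number of spatial positions

    # f and g are projected down to channels // 8 by 1x1 convolutions.
    # This shrinks the projection weights (parameters), not the large
    # intermediate tensor computed below.
    f = tf.keras.layers.Conv2D(channels // 8, 1)(x)   # (batch, h, w, c//8)
    g = tf.keras.layers.Conv2D(channels // 8, 1)(x)   # (batch, h, w, c//8)
    v = tf.keras.layers.Conv2D(channels, 1)(x)        # (batch, h, w, c)

    f = tf.reshape(f, (-1, n, channels // 8))
    g = tf.reshape(g, (-1, n, channels // 8))
    v = tf.reshape(v, (-1, n, channels))

    # The attention map is (n x n) = (h*w x h*w) no matter how small
    # channels // 8 is, which is why memory scales with the spatial size
    # of the feature map rather than with the channel count.
    attn = tf.nn.softmax(tf.matmul(f, g, transpose_b=True))  # (batch, n, n)
    out = tf.matmul(attn, v)                                  # (batch, n, c)
    return tf.reshape(out, (-1, h, w, channels))
```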

@kiyohiro8
Owner

I added some code to get the gamma coefficient in the Self-Attention layers. Please check lines 130-132 in SAGAN.py.
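
As a rough illustration of one way to log gamma during training, here is a sketch; this is not the actual code at lines 130-132 of SAGAN.py, and the layer name "self_attention", the use of tf.keras, and gamma being the last entry of `get_weights()` are all assumptions that should be checked against `SelfAttentionLayer.py`:

```python
from tensorflow import keras

class GammaLogger(keras.callbacks.Callback):
    """Records the scalar gamma of one attention layer after every epoch."""

    def __init__(self, layer_name="self_attention"):
        super().__init__()
        self.layer_name = layer_name
        self.history = []

    def on_epoch_end(self, epoch, logs=None):
        layer = self.model.get_layer(self.layer_name)
        gamma = layer.get_weights()[-1].item()  # assumes gamma is stored last
        self.history.append(gamma)
        print(f"epoch {epoch}: gamma = {gamma:.4f}")

# With a custom GAN training loop, the same two lines
# (get_layer(...).get_weights()[-1]) can be called directly every few steps,
# and the collected values plotted afterwards, e.g. plt.plot(logger.history).
```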

@w617156977
Author

@kiyohiro8 Thanks for your help!
