
Author Response #4

ameya98 commented Jun 29, 2021

We would like to thank all reviewers for their detailed and constructive feedback.

Summary of Major Changes

  • Our article’s narrative has been restructured: we now begin with the easier-to-explain ‘localized’ methods and then move on to spectral convolutions.
  • We have added a new visualization of the spectral decomposition of natural images, which helps the reader understand the Laplacian spectrum better.
  • Our core visualizations otherwise remain the same, apart from changes to improve accessibility and fix UI issues.
  • We have dropped the section on the Game of Life.

Description of Major Changes

We would like to share more details of the major changes we made, which address multiple reviewers’ comments at once.

Reordering Sections of Different Methods

Our original ordering of the different sections (Spectral Convolutions, then Polynomial Filters and finally Modern GNNs) was purely chronological, which in hindsight was not a great choice. As Reviewer #3 pointed out, while spectral convolutions are the most mathematically involved and hardest to understand, the reader does not actually need to understand them to appreciate the later sections which use arguably simpler techniques.

Following the suggestions of both Reviewer #2 and Reviewer #3, we have decided to re-arrange these sections in the following order:

  • Polynomial Filters
  • Modern GNNs
  • Spectral Convolutions

We have consequently modified the narrative to flow better with this new order.

We believe that this change should significantly improve the readability of our article.

The specific changes to each of the affected sections are described below.

Modification of Figure in ‘Extending Convolutions to Graphs’

With the distinction between spectral and spatial convolutions now reframed as one between local and global convolutions, we have replaced the figure in this section with one showing how localized convolutions on a graph are similar to standard convolutions on a grid. This emphasizes the connections made between CNNs and modern GNNs in the text, both here and in later sections.

Revisions to ‘Spectral Convolutions’

All three reviewers found the section on spectral convolutions quite dense for a general audience to understand.

We have addressed this with the following changes:

  • We have improved the visualization of the Laplacian eigenvectors on a grid, by showing the individual eigenvectors underneath the sliders. We have additionally removed the ‘Number of Eigenvectors’ slider as we realized it didn’t add to the narrative.
  • We have added a new visualization of the spectral decomposition of sample images from ImageNet. In particular, this visualization emphasizes that keeping only the first few components of the spectral representation is sufficient to recover much of the information in the image (see the sketch after this list).
  • We have changed the mathematical notation to be clearer, and added references to a simple tutorial on eigenvectors and eigenvalues.
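
To make the idea concrete, here is a minimal NumPy sketch (not the code behind our visualization) of the effect the new figure illustrates: treat an image as node features on a grid graph, project it onto the Laplacian eigenvectors, and keep only the first few spectral components. The grid size, the number of components kept, and the random stand-in image are arbitrary choices for illustration.

```python
# A minimal sketch of spectral reconstruction of an image on a grid graph.
import numpy as np

def grid_laplacian(n):
    """Unnormalized Laplacian L = D - A of an n x n grid graph (4-neighbour)."""
    N = n * n
    A = np.zeros((N, N))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            if r + 1 < n:
                A[i, i + n] = A[i + n, i] = 1.0   # vertical edge
            if c + 1 < n:
                A[i, i + 1] = A[i + 1, i] = 1.0   # horizontal edge
    return np.diag(A.sum(axis=1)) - A

n, k = 16, 32                          # image side length, components kept
image = np.random.rand(n, n)           # stand-in; in practice, a natural image patch
L = grid_laplacian(n)
eigvals, eigvecs = np.linalg.eigh(L)   # eigenvectors ordered by increasing eigenvalue

x_hat = eigvecs.T @ image.flatten()    # spectral representation of the image
x_hat[k:] = 0.0                        # keep only the k smoothest components
reconstruction = (eigvecs @ x_hat).reshape(n, n)
print("relative error:", np.linalg.norm(reconstruction - image) / np.linalg.norm(image))
```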

Revisions to ‘Polynomial Filters’ (earlier ‘From Global to Local Convolutions’)

We found that polynomial filters over node neighbourhoods can be explained without first introducing the Laplacian spectrum, making this section significantly more accessible.
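
To make this concrete, here is a minimal sketch (not the article’s code) of a polynomial filter p_w(L) x = w_0 x + w_1 Lx + … + w_d L^d x acting on node features: each multiplication by the Laplacian L only mixes a node’s feature with those of its immediate neighbours, so a degree-d filter depends on at most the d-hop neighbourhood of each node, and no spectral machinery is required. The path graph and the filter weights below are arbitrary.

```python
# A minimal sketch of a polynomial filter of the graph Laplacian.
import numpy as np

def polynomial_filter(L, x, w):
    """Compute p_w(L) x = sum_i w[i] * L^i x without any eigendecomposition."""
    out = np.zeros_like(x)
    Li_x = x.copy()            # L^0 x
    for w_i in w:
        out += w_i * Li_x
        Li_x = L @ Li_x        # next power of L applied to x
    return out

# Path graph on 4 nodes: 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
x = np.array([1.0, 0.0, 0.0, 0.0])   # feature localized at node 0
w = [0.5, -0.2, 0.1]                 # degree-2 filter
# Node 3 is three hops from node 0, so its output remains zero.
print(polynomial_filter(L, x, w))    # [ 0.5 -0.1  0.1  0. ]
```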

Further, we cleaned up the visualization of the convolution with polynomial filters:

  • We show the equivalent convolutional kernel at the highlighted position. Since this kernel depends on the position, users can hover over the input grid to change the highlighted position.
  • We have reduced the maximum degree of the polynomial from 4 to 2, in the interest of clarity. For readers interested in changing this and other parameters (such as the grid size), we have shared a forkable ObservableHQ notebook, just as for all the other interactive visualizations in this article.
  • Since we have not introduced what a spectral representation of a feature vector is, we have removed this component from this visualization.

Removing the ‘Game of Life’ Section

After much deliberation, we have decided to remove this section of the article. While whether GNNs can learn the Game of Life better than CNNs do is definitely an interesting question, it is in some sense orthogonal to the article’s goals of introducing and understanding GNNs. The reviewers felt this section needed much work in its current state, and we could not find a suitable revision that would fit well in this article. We considered the following options:

  • Experimenting with more powerful models:
    • Reviewer 1 suggested considering models with more powerful aggregation schemes, such as Principal Neighbourhood Aggregation (PNA). Our entire experiment revolves around the definition of a ‘minimal model’ for each choice of architecture. PNA consists of multiple aggregators which can individually distinguish different types of neighbourhoods. However, to solve the Game of Life problem, only the ‘sum’ aggregator is essential (see the sketch after this list). As a result, the ‘minimal model’ for PNA would essentially be the ‘GCN minimal model’ for which we have presented results here.
  • Exploring changes to the existing models:
    • We considered looking at different activation functions apart from ReLU (which famously suffers from the dead-neuron problem) and different optimizers. As referred to in the article, we originally mimicked the architectural choices from It's Hard for Neural Networks to Learn the Game of Life for a fair comparison. The overall conclusion from our experiments is actually similar to theirs: it is not clear why our minimal models cannot learn the Game of Life in practice. This then becomes a question of why our optimization methods fail to recover the optimal solution. This is still very much an open question in understanding the dynamics of neural network training, and is probably worth a separate article on its own.
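
For completeness, here is a minimal sketch (separate from both the article and our experiments) of why the sum aggregator alone suffices in principle: the next state of a Game of Life cell is a function only of its own state and the sum of its eight neighbours’ states. The toroidal boundary and the glider initial state are arbitrary choices for illustration.

```python
# A minimal sketch: the Game of Life update needs only a sum over each cell's
# 8-neighbourhood plus a pointwise rule. Uses toroidal (wrap-around) boundaries.
import numpy as np

def life_step(grid):
    """One Game of Life step; grid is a 2D array of 0s and 1s."""
    neighbour_sum = sum(
        np.roll(np.roll(grid, dr, axis=0), dc, axis=1)
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next step iff it has exactly 3 live neighbours,
    # or it is alive and has exactly 2 live neighbours.
    return ((neighbour_sum == 3) | ((grid == 1) & (neighbour_sum == 2))).astype(int)

glider = np.zeros((6, 6), dtype=int)
glider[1, 2] = glider[2, 3] = glider[3, 1] = glider[3, 2] = glider[3, 3] = 1
print(life_step(glider))
```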

Reviewer #2 suggested a deeper dive into the learned weights: we did find that the CNN and GNN models tend to make similar errors, but we could not identify any simple reasons explaining why these error patterns arise.

We originally included this section because we found these experiments interesting in their own right (as did Reviewer #1). As mentioned above, we do believe that these results and their implications for neural network optimization are worth investigating further, but we deemed them out-of-scope for the current article, which aims to provide an accessible introduction to GNNs.

Fixing Accessibility Issues

Several comments about the inaccessibility of the color scheme used in the interactive GNN equations and visualizations were completely valid. We have now utilized a new feature in the Chrome browser’s DevTools that simulates various types of vision deficiencies to debug accessibility issues in our article, and referred to Coloring for the Color Blind to find a significantly more accessible color scheme for this section. Finally, we have validated our final article for W3C compliance using the W3C Markup Validation Service, which helps ensure our article can be processed correctly by screen readers.

Separating out Mathematically Heavy Content

The reviewers mentioned that some sections involved too much mathematical machinery which affected the readability of the article. We do want to add these details for the interested reader, while still keeping the article accessible for a general audience. Hence, we have now separated out mathematical details in specially highlighted boxes with the title ‘Details for the Interested Reader’ wherever applicable.

Responses to Individual Reviewers

A significant portion of the reviewers’ feedback has been addressed by the above changes. Here, we address specific feedback from each of the reviewers.

Reviewer #1

The introduction is well written, but makes it seem like there was no ML on graphs before GNNs. Maybe worth mentioning graph kernels and random walk based methods, for e.g.

We now do so, in a brief paragraph.

Maybe it's better to say 'in x section, we'll explore how the ...' because the current sentence made me click and jump over to the new section without reading anything before it.

Maybe before introducing notation, it will be helpful to say in plain english what a GNN does, e.g., GNNs iteratively build representations for each node in the graph (via xyz process).

Thank you; we have addressed these issues.

The figure here could definitely do with interactivity! It would add to the coolness of the presentation if, e.g. I can see which neighbors contribute to the update of which node by hovering over it.

Great suggestion! Our new diagram in this section, which shows the similarities between pixel neighbourhoods in CNNs and node neighbourhoods in GNNs (replacing the older one comparing spectral and spatial convolutions), is interactive. When the reader hovers over a node in the original graph on the left, its immediate neighbourhood is highlighted and arrows are drawn to the updated node on the right.

I would suggest reversing the order of these two points for more impact: ConvNets are great for images --images are grid graphs --convolution on graphs.

Done!

It would be helpful to say what \hat x_i is here.

Done!

In general, this particular section does have many linear algebra terms being thrown at the reader, so I'd suggest either adding more footnotes to solidify and build intuitions/analogies or provide useful links.

We have changed the notation here to be more accessible to readers. Further, we have also added a footnote linking to a tutorial on eigenvalues and eigenvectors.

why can't I just convolve/multiply the natural representations of the feature and weight? Why do I need to bother with the spectral representations?

We have added a short paragraph describing the issue. Your other comments about this section should also be addressed by the rewrite of this section.

I believe this is a recent line of work by Ron Levie et al. worth looking into.

Thank you for this reference! We have added this to our discussion on transferability of spectral convolutional filters.

Maybe the colors are too similar to each other here?

We have fixed this: thank you for this feedback!

Game of Life

This section has been removed in the latest version of this article. Please see our detailed comments above for our thoughts.

On diagrams, I liked the interactive diagrams a lot in how they helped me understand the concepts. Their graphic design may be iterated upon for better incorporating design best practices.

Thank you for this feedback. We have reworked the design of a few interactive diagrams to be more user-friendly. We hope you also enjoy the new visualizations in the latest version of this article!

On writing and readability, there are several math heavy sections -- I feel that the authors could help readers by being very explicit when using a symbol defined many paragraphs ago, or provide more intuitive understanding of a couple mathematical concepts related to spectral convolutions.

We hope the revisions to the Spectral Convolutions section described above improve the readability of this article.

Reviewer #2

Graph networks are a growing subfield, and definitely a topic that needs a clear introduction. Many existing introductions are overly technical and difficult for a newcomer to parse. While I do have some complaints about this article as it stands, I also think it has a great deal of potential, and can be an extremely valuable contribution to Distill and the community at large with some adjustments. In my review form, I’ve marked a few areas as a 3, but I really feel that there is a clear path to improvement in these areas and with a bit of revision, I think this article will be excellent.

Thank you! We have made significant revisions in the latest draft with your comments in mind.

In particular, I think drawing stronger parallels to convolutional networks early in the article, perhaps accompanied by visualizations showing how a pixel grid can be interpreted as a graph and how common graph network operations work when applied to a familiar pixel grid would greatly improve approachability for readers who are familiar with CNNs but are new to graph networks. Even readers who are not as familiar with CNNs would likely benefit from seeing examples on a familiar grid structure with a concrete application in mind.

We have added an interactive figure in ‘Extending Convolutions to Graphs’ showing the similarities between convolutions in CNNs and convolutions in modern GNNs. Further, our visualization of the polynomial filters (i.e. ChebNet) on a grid now shows an equivalent convolutional kernel. Both of these should significantly help readers transfer concepts from CNNs to GNNs.

I think the section on spectral convolution is a big difficulty jump over the rest of the article, and will probably cause a fair number of readers to tune out. Certainly this is the most mathematically advanced section, and so is going to be the most difficult to explain in an approachable way, but I think there is at least some room for improvement. Drawing comparisons to the image Laplacian may help readers who are new to the concept, as the image Laplacian is much easier to visualize.

As described above, the spectral convolutions section is now at the end of the article; it has been rewritten for clarity, and features a new visualization illustrating the spectral decomposition of images from ImageNet.

More informed than what?

It’s not totally clear that this should be a deal-breaker

This sentence is a little stilted and could be rephrased.

The link in this sentence seems to suggest to the reader that they should jump ahead to that section, but they will likely be lost if they do so.

It may be worth giving an example of what a node feature is.

It may be better to just explain this convention later if it becomes important for a particular example.

This sort of comes out of left field.

Thank you for all of these detailed comments: we have addressed them in our revised draft.

I think it may be helpful to begin this section by drawing a pixel grid as a graph, with RGB values as node features. This would give a reader who is familiar with CNNs but unfamiliar with graph networks a clear bridge between the two. It would also help illustrate why regular convolution does not trivially carry over.

Done! As mentioned above, we have an interactive figure here that shows neighbourhood structures changing for different nodes.

One approach to soften this transition may be to begin with a review of the image laplacian. Describe how it detects smoothness vs edges in an image, and explain that we would like to do the same for a graph. This would make it clear why we’re constructing L the way we are, and why this is likely to be a useful operator. As it is now, the reader has to slog through a bunch of intimidating math before they get the payoff of seeing why.

The Graph Laplacian now lives in the Polynomial Filters section. Here, we describe the construction of the Laplacian, and have an interactive visualization that shows an equivalent convolutional kernel for Laplacian multiplication on a grid. We hope these additions make the connections between CNNs and GNNs clearer.
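
As a small sanity check of the idea behind that visualization (a sketch under our own assumptions: a 4-neighbour grid graph with the unnormalized Laplacian L = D − A, not the code used in the article), multiplying the grid’s node features by the graph Laplacian matches the familiar image-Laplacian kernel at interior pixels:

```python
# A minimal sketch: Laplacian multiplication on a 4-neighbour grid graph agrees
# with the 3x3 image-Laplacian kernel away from the boundary.
import numpy as np

n = 8
x = np.random.rand(n, n)               # arbitrary node features on the grid

# Graph-side computation: (L x)_v = deg(v) * x_v - sum of x over v's neighbours.
Lx = np.zeros_like(x)
for r in range(n):
    for c in range(n):
        nbrs = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        nbrs = [(i, j) for i, j in nbrs if 0 <= i < n and 0 <= j < n]
        Lx[r, c] = len(nbrs) * x[r, c] - sum(x[i, j] for i, j in nbrs)

# Image-side computation: the equivalent convolutional kernel (symmetric, so
# convolution and cross-correlation coincide).
kernel = np.array([[ 0, -1,  0],
                   [-1,  4, -1],
                   [ 0, -1,  0]], dtype=float)
conv = np.zeros_like(x)
for r in range(1, n - 1):
    for c in range(1, n - 1):
        conv[r, c] = np.sum(kernel * x[r - 1:r + 2, c - 1:c + 2])

print(np.allclose(Lx[1:-1, 1:-1], conv[1:-1, 1:-1]))   # True
```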

When hitting ‘reset’, the sliders for the x-hat values do not return to the zero point.

Thank you for noticing this UI bug. We have fixed this here as well as in other visualizations in this article.

It may be useful to draw the image for x-hat_i = 1 (all others = 0) above each slider so that the reader can see the different components that are being intermixed. It might also be useful to show their R_L values so we can see that the earlier ones are smoother.

Great suggestion! We have added a view of individual eigenvectors underneath each of the corresponding sliders. We decided not to add the R_L values to keep the visualization uncluttered.
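
For readers of this response who are curious what those values would have shown: assuming R_L refers to the Rayleigh quotient R_L(x) = (xᵀLx)/(xᵀx) used as a smoothness measure, a unit-norm eigenvector’s R_L value is exactly its eigenvalue, so the earlier eigenvectors do have smaller R_L and are smoother. A minimal sketch on a 5-node path graph:

```python
# A minimal sketch (assuming R_L is the Rayleigh quotient of the Laplacian):
# the Rayleigh quotient of each eigenvector equals its eigenvalue.
import numpy as np

def rayleigh_quotient(L, x):
    return (x @ L @ x) / (x @ x)

# Path graph on 5 nodes: 0 - 1 - 2 - 3 - 4.
A = np.diag(np.ones(4), k=1) + np.diag(np.ones(4), k=-1)
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)

print([round(rayleigh_quotient(L, eigvecs[:, i]), 3) for i in range(5)])
print(eigvals.round(3))   # identical: earlier (smoother) eigenvectors have smaller R_L
```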

I think this is the first we’re hearing about filters. There should probably be some explanation that says that just like we try to learn a set of filters in a conv net, we’re going to be learning a set of filters for our graph net, and all this work we did with the laplacian and spectral decomposition is going to show us how to perform a convolution with those filters.

Thank you for this feedback: we have rewritten this section as well for clarity along the lines you brought up.

I’m not sure whether this section on ChebNet is necessary.

With the re-ordering of sections, the Polynomial Filters (i.e. ChebNet) section is now first. We have rewritten this section to remove references to the spectral aspects of the Laplacian, and have reworked the visualization here to be clearer. In its current form, we believe the discussion of polynomial filters transitions nicely into the discussion of modern GNNs, without involving much mathematical detail.

It may be unclear to the reader that the text after the diagram also changes when you flip to different architectures.

We have moved these equations into a box with a shaded background just like the other interactive visualizations to show that the buttons control both the diagrams and the text.

I wonder if there is a way to re-order the presentation to begin with these friendlier techniques? I know that the spectral approach and ChebNets came first, but I think there’s some value in giving the reader a monotonic increase in complexity over the course of the article.

Indeed, as described above, we have reordered the sections to make the article more approachable.

This is really the highlight of the whole article. Really great work.

Thank you for the encouraging comments!

The Game of Life

This section has been removed in the latest version of this article. Please see our detailed comments above for our thoughts.

It may be helpful to add a sentence here noting that CNNs can take advantage of efficient vectorized convolution on GPUs

Thank you: we have added this.

Reviewer #3

The most novel aspect of the article, in my opinion, is that it exposes three different approaches to graph neural networks - global convolutions, local convolutions, and modern spatial convolutions - into a coherent framework.

Thank you for this feedback! Our revised article continues to expose all of these methods within a common framework but with sections reordered, visualizations reworked and text rewritten for clarity.

There's a long period of setting up the math behind spectral GNNs and their relationship to ChebNet, and one wishes to see a big payoff in the third section about modern spatial convolutions. However, I think for the most part, the payoff is not there.

Or would it work if sections 3, 2, and 1 were reordered as 3, 1 and 2: message passing first, take the infinite limit next, then truncate. It could be worth exploring whether these changes would make the article flow better.

We strongly agree with your comments here. We have reordered the sections in the paper and changed the overall narrative of the article without cutting down on the core content. We hope you find the resulting article significantly easier to read.

but the problem is it never shows the eigenvectors in isolation

Thank you for this feedback. We have added this in the latest draft. The visualization should be much more intuitive now!

One thing that doesn't help is that the area that's covered by the example image is tiny.

Thank you for noticing this. We have enlarged the ‘pixels’ in this visualization.

It may be interesting to show an equivalent convolution kernel in addition to the image acted upon by the convolution kernel.

As described above, we have indeed added this, while also cleaning up this visualization to be significantly easier to parse.

I don't think that the number of eigenvectors slider in The Graph Laplacian interactive adds to the narrative.

We agree with your comments: this slider has been removed, and the number of eigenvectors is now fixed to 10.

There's a lot of use of exclamation points (32!) and I think it's stylistically jarring compared to other distill articles.

Thank you for noticing this: we have cut down on the number of exclamation points.

The text which introduces the interactive visualizations is a little poorly integrated. In other distill articles, the default state of the visualization shows a meaningful example, and there is a caption below to indicate what the figure is supposed to communicate.

We have attempted to improve this in our latest draft. In particular, the default states for our visualizations have been improved to be meaningful even before user input.

It's not clear in the visualizations which are the important buttons to click. For instance in Interactive Graph Neural Networks the key action is Update All Nodes but it's not bolded or in a prominent location.

We have reworked several of the visualizations to be clearer. We would like to point out that in the ‘Interactive Graph Neural Networks’ visualization, it is not necessary for the reader to click Update All Nodes. The goal of the visualization is to show the update equations at different nodes; the Update All Nodes button shows what happens if all of the update equations are actually applied.
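
To spell out what we mean (a minimal, hypothetical sketch of a synchronous node update, not the exact equations shown in the visualization): each node’s update reads only the current features of its neighbours, so applying the update at every node simultaneously, as the Update All Nodes button does, is well defined.

```python
# A minimal, hypothetical sketch of applying one GNN-style update to all nodes
# at once; the graph, feature sizes, and weights are arbitrary.
import numpy as np

def update_all_nodes(features, neighbours, W_self, W_neigh):
    """h_v' = ReLU(W_self h_v + W_neigh * mean of the neighbours' features)."""
    new_features = {}
    for v, h_v in features.items():
        agg = np.mean([features[u] for u in neighbours[v]], axis=0)
        new_features[v] = np.maximum(0.0, W_self @ h_v + W_neigh @ agg)
    return new_features

# Tiny example: a triangle graph with 2-dimensional node features.
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
features = {v: np.random.rand(2) for v in neighbours}
W_self, W_neigh = np.eye(2), 0.5 * np.eye(2)
print(update_all_nodes(features, neighbours, W_self, W_neigh))
```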

The visualizations take more than one screen on my biggest screen and have lots of whitespace.

Thanks for noticing this: we have fixed the whitespace issue, and the visualizations should now fit on one page. We tested the article on a variety of devices using Chrome’s DevTools to simulate different screen sizes. We realize that there may be browser/device combinations that render the article differently: we would like to follow up separately if that is the case.

Most of the examples are based on data that are well represented by images. Could you have an example that is not a random graph or an image, something that could only be meaningfully done with graph neural nets?

We did consider this, but we felt that it was out-of-scope for the current article, especially since we cover many different methods already. We do however refer the reader to several survey papers that describe the unique applications of GNNs.

it could be made more prominent by more different colors than just shades of red.

Thank you for this feedback: we have replaced this with a more prominent and color-blind friendly color scale.

Overall, a fine contribution to Distill.

Thank you for the encouraging comment!
