Feedback about documentation #196

zhiyil1230 · 2023-02-01T09:25:53Z

It would be good to have an overview of the models available in the readme
Maybe introduce in readme about what does each folder contains?
We could also pass config file with command line.
would be good to see tutorials about how to use/switch or connect different pyg layers and modules in nn/pyg_layers

s-maddrellmander · 2023-02-02T11:52:41Z

Some assorted thought on GOLI after a few weeks developing.

Abstraction makes things neatly reusable but makes understanding where to modify classes and variables complex - ends up with many files open and following references through the code. Not in itself an issue but makes it hard for a new user to understand what modifications are doing if they aren't obviously covered in the config.
I find it a bit confusing thinking as a user how I'm expected to interact with the framework - this may be in part because of developing + adding functionality while also running - but I've found this a bit opaque.
Detailed documentation about how to modify models / configs + training would help - I think this is being developed atm

joao-alex-cunha · 2023-02-02T15:09:15Z

One thing that in my opinion would be useful, is for the user to understand what is the design space of all possible models that can be designed with GOLI. Specifically how the config file defines the design space. I think a good way to achieve this is through a diagram that visually explains this. Showing how a list of input tensors, is processed by the pre_nn, pre_nn_edges, pe_encoders, gnn, post-nn, task heads and metrics. This will help make clear the connections between the previous modules.

callumm-graphcore · 2023-02-09T12:18:27Z

Random thoughts:

Having configs only in YAML means you don't have IDE tooling when you edit them, so it would be nice if it was possible to have a Python object which could be used for configuration
Supporting both DGL and PyG seems excessive and leads to duplicated functionality, but there might be good reasons to do so such as accessing functionality that is accessible in one but not the other
Baking general good practice like norms and dropout into each layer is a great idea but makes it harder to customise some things (for example, you can't choose between "pre-" and "post-" layer norm)
It's hard not to feel like the wheel is being reinvented in a few places (I'm not convinced there's an advantage to using FCLayer over plain nn.Linear?)
I'm sure you're aware of this, but some work needs to be done to convert it to a "library" in the sense of it being a set of components you can use on its own. At the moment, it doesn't seem to be possible to use Goli without doing pip install -e . and living inside it

joao-alex-cunha · 2023-02-16T16:42:36Z

I'm removing the ticket from this sprint, but we can leave it open to collect more feedback in the future

DomInvivo · 2023-02-27T22:19:16Z

General documentation #181

Overview of the FullGraphMultiTaskNetwork, as a drawing Improve documentation #247
Introduce the content of each folder in the Readme Introduce the content of each folder in the Readme #289
Update the tutorials about config files Cleanup + Docs + Parquet #283
Overview of the design space Cleanup + Docs + Parquet #283
Document make_mup_base_kwargs and how muP works Document make_mup_base_kwargs and how muP works #205

Usability

Make cli for different options Add click options for command line #162
Provide one config for reproducing GPS++
Kill DGL support Remove DGL support #233

For the other comments:

Having configs only in YAML means you don't have IDE tooling when you edit them, so it would be nice if it was possible to have a Python object which could be used for configuration

You can pass a dictionary, you just have to create your own main file

Baking general good practice like norms and dropout into each layer is a great idea but makes it harder to customise some things (for example, you can't choose between "pre-" and "post-" layer norm)

I don't think it's necessary to have normalization before a layer. In the case of the MLP and FeedForwardNN, there's the first_normalization option.

I'm not convinced there's an advantage to using FCLayer over plain nn.Linear?
FCLayer has a lot more options, norm, dropout, activation. In some cases, it is used without any of these, and replicates the Linear.

The reason it is used is that it defaults to the mup initialization instead of the pytorch one.

some work needs to be done to convert it to a "library" in the sense of it being a set of components you can use on its own

This is already done internally. We have a pipeline in-place to package it and we will do it once we reach a stable version

callumm-graphcore · 2023-03-02T11:32:36Z

You can pass a dictionary, you just have to create your own main file

The problem for me has more to do with discoverability of the features of the library. It's easier to work out what you need to provide for e.g. a dataclass than a dictionary that needs to have some particular keys.

Some work needs to be done to convert it to a "library" in the sense of it being a set of components you can use on its own

Where is this pipeline? Can we try it out and try to find out what might be missing?

I don't think it's necessary to have normalization before a layer. In the case of the MLP and FeedForwardNN, there's the first_normalization option.

This is the specific paper I was thinking of, but I'll admit I haven't read it in enough detail to know if its ideas are useful in this context: https://arxiv.org/abs/2002.04745

DomInvivo · 2023-03-11T20:46:34Z

Answers to @callumm-graphcore

The problem for me has more to do with discoverability of the features of the library. It's easier to work out what you need to provide for e.g. a dataclass than a dictionary that needs to have some particular keys.

I understand your point. But there are many advantages to a YAML / dictionary.

YAML is quite human-readable.
When dealing classes that have tons of parameters, such as lightning.Trainer, with the yaml you can simply use any feature available within the Trainer. If you use a dataclass, you will have to do a signature check of such classes.
It will make it a bit harder to deal with nested parameters, and with the wandb sweeps.

Of course, it would be great if we can transition to dataclass at some point to have IDE suggestions, but it would require lots of work.

Where is this pipeline? Can we try it out and try to find out what might be missing?

@hadim perhaps you can answer this one? He's asking for the pipeline to package Goli into a library on pip / conda.

For the normalization before the layer, This is the specific paper I was thinking of, but I'll admit I haven't read it in enough detail to know if its ideas are useful in this context: https://arxiv.org/abs/2002.04745

Since this paper is specific to the Transformer layer, then this change only needs to happen inside GPS. But again, we're using a hybrid architecture, so it makes our model way less sensitive to the Transformer stability.

hadim · 2023-03-11T21:11:24Z

As long as the repo is pip-compatible, it should be fine for conda packaging as well. I'll try to package the newest version with conda and let you know.

hadim · 2023-03-11T21:13:14Z

When are you planning to remove dgl? It would be best to not have it for packaging (especially for the open source release since dgl is not on conda forge yet).

DomInvivo · 2023-03-11T21:58:46Z

@hadim

When are you planning to remove dgl?
Soon. Definitly before release.

Note that the IPU stuff requires Pip, poptorch is still not available on conda.

hadim · 2023-03-20T21:10:39Z

A quick update here:

the conda packaging is working after Packaging #254
I have opened a ticket suggesting reviewing all the goli deps and also potentially address some warnings seen during the tests: Dependencies review and warnings #255

DomInvivo · 2023-05-02T21:22:36Z

I will close this issue as it simply regroups many other issues.

zhiyil1230 added this to the Documentation + refactoring milestone Feb 1, 2023

joao-alex-cunha added Graphcore Require Graphcore's expertise Graphcore Sprint Tickets targeted in a GC sprint labels Feb 2, 2023

joao-alex-cunha removed the Graphcore Sprint Tickets targeted in a GC sprint label Feb 16, 2023

DomInvivo closed this as completed May 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feedback about documentation #196

Feedback about documentation #196

zhiyil1230 commented Feb 1, 2023 •

edited

Loading

s-maddrellmander commented Feb 2, 2023

joao-alex-cunha commented Feb 2, 2023

callumm-graphcore commented Feb 9, 2023

joao-alex-cunha commented Feb 16, 2023

DomInvivo commented Feb 27, 2023 •

edited

Loading

callumm-graphcore commented Mar 2, 2023

DomInvivo commented Mar 11, 2023 •

edited

Loading

hadim commented Mar 11, 2023

hadim commented Mar 11, 2023

DomInvivo commented Mar 11, 2023 •

edited

Loading

hadim commented Mar 20, 2023

DomInvivo commented May 2, 2023

Feedback about documentation #196

Feedback about documentation #196

Comments

zhiyil1230 commented Feb 1, 2023 • edited Loading

s-maddrellmander commented Feb 2, 2023

joao-alex-cunha commented Feb 2, 2023

callumm-graphcore commented Feb 9, 2023

joao-alex-cunha commented Feb 16, 2023

DomInvivo commented Feb 27, 2023 • edited Loading

callumm-graphcore commented Mar 2, 2023

DomInvivo commented Mar 11, 2023 • edited Loading

hadim commented Mar 11, 2023

hadim commented Mar 11, 2023

DomInvivo commented Mar 11, 2023 • edited Loading

hadim commented Mar 20, 2023

DomInvivo commented May 2, 2023

zhiyil1230 commented Feb 1, 2023 •

edited

Loading

DomInvivo commented Feb 27, 2023 •

edited

Loading

DomInvivo commented Mar 11, 2023 •

edited

Loading

DomInvivo commented Mar 11, 2023 •

edited

Loading