Add an unsupervised warm-up for the models #140
base: master
Conversation
This is super cool Gabi! Thanks so much. Just reviewing now.
This is amazing work dude!! Just a few qs - thanks so much for implementing this.
```python
neighbour_indices: List[int] = []
distant_indices: List[int] = []

outer_distance = tuple(multiplier * val for val in distance)
```
What's the role of the multiplier?
It's basically to enforce a minimum distance between the neighbouring instance and the distant instance. The neighbour will be within neighbouring_distance of the anchor; the distant instance will be further than multiplier * neighbouring_distance from the anchor.
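Roughly like this (an illustrative sketch, not the PR's exact code: it collapses the per-dimension distance tuple into a single radius, and the function name is made up):

```python
import numpy as np

def split_indices(anchor: np.ndarray,
                  latlons: np.ndarray,
                  neighbouring_distance: float,
                  multiplier: float = 2.0):
    # distance of every instance from the anchor (single-radius simplification)
    dists = np.linalg.norm(latlons - anchor, axis=1)
    # neighbours: within neighbouring_distance of the anchor
    neighbour_indices = np.where(dists <= neighbouring_distance)[0].tolist()
    # distant: further than multiplier * neighbouring_distance from the anchor
    distant_indices = np.where(dists > multiplier * neighbouring_distance)[0].tolist()
    return neighbour_indices, distant_indices
```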
Gotcha! So basically enforcing how large an area our spatial differences should be over?
yup!
```python
@@ -288,11 +289,14 @@ def forward(
    x = self.rnn_dropout(hidden_state[:, -1, :])

    if return_embedding:
```
Is this for interpreting the static embedding?
No - the loss in tile2vec compares the embedding, not the final value. This is to return that embedding for the loss, before it gets put through the final linear layer.
Yes, makes sense! Could this be used for interpreting the embedding layer too, though?
Yea, 100%. Although here the "embedding" is the final output of the model before the linear regression layer.
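Schematically, the forward pass looks like this (a sketch; names not in the diff above, like self.head, are illustrative):

```python
def forward(self, x, return_embedding: bool = False):
    hidden_state, _ = self.rnn(x)                 # (batch, seq_len, hidden_size)
    x = self.rnn_dropout(hidden_state[:, -1, :])  # last timestep: the "embedding"
    if return_embedding:
        # tile2vec's loss compares embeddings, so skip the final linear layer
        return x
    return self.head(x)                           # final linear regression layer
```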
```python
# initialize the model
if self.model is None:
    x_ref, _, _ = next(iter(train_dataloader))
    model = self._initialize_model(self._input_to_tuple(x_ref))
```
Does this train the LSTM model? Don't we need to initialise with a CNN, as they use in Tile2Vec?
The principles of tile2vec can be used with any model that takes a raw input and outputs an embedding. So yea, in this case it can also train the (EA)LSTM model.
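The objective itself is tile2vec's triplet loss over (anchor, neighbour, distant) embeddings. A minimal sketch (the function name and signature are illustrative, not this PR's API):

```python
import torch

def triplet_loss(anchor: torch.Tensor,
                 neighbour: torch.Tensor,
                 distant: torch.Tensor,
                 margin: float = 1.0) -> torch.Tensor:
    positive = torch.norm(anchor - neighbour, dim=-1)  # pull the neighbour close
    negative = torch.norm(anchor - distant, dim=-1)    # push the distant instance away
    return torch.relu(positive - negative + margin).mean()
```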
Okay, gotcha.
So have I interpreted this correctly:
"We use the unsupervised learning algorithm described in Tile2Vec to pretrain (initialise) the weights of the EALSTM. This allows us to produce weights in the network that produce sensible spatial patterns. Mainly that pixels close together are more similar than pixels that are far apart."
Yea, that's exactly right.
Inspired by tile2vec, this pretrains the models by training them to make embeddings of instances that are far apart more different than embeddings of instances that are close together.
It's a less rigid way of communicating the latlon information to the models.
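Roughly, the warm-up looks like this (a sketch: model and warmup_dataloader are placeholders, and the dataloader is assumed to yield (anchor, neighbour, distant) batches):

```python
import torch

# model and warmup_dataloader stand in for whatever the PR wires up
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
margin = 1.0
for anchor, neighbour, distant in warmup_dataloader:
    a = model(anchor, return_embedding=True)
    n = model(neighbour, return_embedding=True)
    d = model(distant, return_embedding=True)
    # embeddings of nearby instances should end up more similar than distant ones
    loss = torch.relu(
        torch.norm(a - n, dim=-1) - torch.norm(a - d, dim=-1) + margin
    ).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```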