I-JEPA #1273
Conversation
Draft version of the PR.
Hey @Natyren, can we help you with the PR? Adding I-JEPA to the library would be awesome :) |
Hey, @IgorSusmelj! Sorry for the long wait, I've been really busy. Just wanted to give you a heads up that I'll be adding a new model to the framework next week. |
TODO: add train loop, debug
Added collator and model (encoder and predictor, according to original paper and code). |
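The components mentioned above (context encoder, target encoder, and predictor) can be sketched roughly as follows. This is a minimal, hedged illustration of the I-JEPA structure described in the paper, not the lightly API; all class names and dimensions are made up for the example.

```python
# Illustrative sketch of the three I-JEPA components: a context encoder,
# an EMA-maintained target encoder (no gradients), and a narrow predictor
# that maps context embeddings to predicted target embeddings.
import copy

import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Stand-in for a ViT encoder: maps patch tokens to embeddings."""

    def __init__(self, dim: int = 64) -> None:
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.proj(tokens)


class TinyPredictor(nn.Module):
    """Predicts target-block embeddings from context embeddings."""

    def __init__(self, dim: int = 64, pred_dim: int = 32) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, pred_dim), nn.GELU(), nn.Linear(pred_dim, dim)
        )

    def forward(self, ctx: torch.Tensor) -> torch.Tensor:
        return self.net(ctx)


encoder = TinyEncoder()
target_encoder = copy.deepcopy(encoder)  # EMA copy, never trained by backprop
for p in target_encoder.parameters():
    p.requires_grad = False
predictor = TinyPredictor()

tokens = torch.randn(2, 16, 64)  # (batch, patches, dim)
ctx = encoder(tokens)  # context embeddings
with torch.no_grad():
    targets = target_encoder(tokens)  # target embeddings
preds = predictor(ctx)
print(preds.shape == targets.shape)  # True
```

In the real model the context encoder only sees the unmasked context block and the predictor receives mask tokens for the target positions; the sketch omits the masking for brevity.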
TODO: train and debug
Currently debugging.
@guarin The implementation and the example are complete and use only the Torch framework. Please review them. What should be done next? |
I also tested the current example setup by training the model on an A100 for 10 epochs. The loss decreases consistently during training, which suggests the model is learning. |
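A single training step of the kind described above can be sketched as below: a regression loss between predicted and target embeddings, followed by an exponential moving average (EMA) update of the target encoder. This is a hedged illustration under assumed module names, not the code from this PR.

```python
# Sketch of one I-JEPA-style training step: smooth-L1 loss between predicted
# and target embeddings, then an EMA update that drifts the frozen target
# encoder toward the online encoder. All modules here are toy stand-ins.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Linear(64, 64)
predictor = nn.Linear(64, 64)
target_encoder = copy.deepcopy(encoder)
for p in target_encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)


def train_step(tokens: torch.Tensor, momentum: float = 0.996) -> float:
    # Targets come from the frozen EMA encoder, no gradients.
    with torch.no_grad():
        targets = target_encoder(tokens)
    preds = predictor(encoder(tokens))
    loss = F.smooth_l1_loss(preds, targets)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # EMA update: target weights move slowly toward the online encoder.
    with torch.no_grad():
        for pt, po in zip(target_encoder.parameters(), encoder.parameters()):
            pt.mul_(momentum).add_(po, alpha=1 - momentum)
    return loss.item()


loss = train_step(torch.randn(8, 64))
```

A steadily decreasing loss here only shows the predictor is tracking the slowly moving targets; as noted later in the thread, a proper evaluation (e.g. an ImageNet benchmark) is still needed.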
Thanks a lot for the PR @Natyren! This looks great! I'll have a closer look ASAP. |
This looks really good!
I would suggest that we merge it as is and mark it as experimental until we have a fully tested version. There are also some points we'll have to address in follow-up issues/PRs (let me know if you are interested in helping out @Natyren):
- Torchvision ViT has no stochastic depth (here) while IJEPA ViT has it (here). I had the same issue when working on MAE. The annoying thing is that drop path is not even used during pretraining but only during finetuning. Without it we'll sadly not be able to properly evaluate the model. For MAE I added drop path only during finetuning but it is pretty hacky, see here. I guess our best bet is to switch to the timm vision transformer, which already has support for it, see Add timm dependency? #1301
- I believe torchvision ViT also doesn't do this part of the initialization but I have to check in detail.
- The collator looks like it should be in the transform instead. The only thing I can find that depends on the whole batch is the number of masks here. But maybe I am missing something.
- The positional embedding functions should be moved from `lightly/models/modules/i_jepa` to `lightly/models/utils`. Note that I already partially implemented them in the MAE PR: Guarin lig 3056 add mae imagenet benchmark #1263
- Some docstrings are missing or in the wrong format. For example the `apply_masks` method or `IJEPAMaskCollator`.
- We should also add references to the original paper/code in major classes/functions. For example in `IJEPAMaskCollator`.
- Use `ijepa` instead of `i_jepa` for filenames to make it more consistent with the class names.
- Use camel case in all class names. For example, change `IJEPA_predictor` to `IJEPAPredictor` etc.
- Add tests
- Finish examples and docs
- Add imagenet benchmark to `lightly/benchmarks`
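Regarding the first point in the list above, stochastic depth (drop path) randomly drops entire residual branches per sample during training. A minimal sketch, similar in spirit to the timm implementation but written here from scratch as an illustration (the class name and defaults are assumptions, not the lightly or timm API):

```python
# Minimal drop-path (stochastic depth) sketch: during training, each sample's
# residual branch is zeroed with probability drop_prob and the survivors are
# rescaled so the expected value is unchanged. In eval mode it is the identity.
import torch
import torch.nn as nn


class DropPath(nn.Module):
    """Randomly zeroes whole residual branches per sample during training."""

    def __init__(self, drop_prob: float = 0.1) -> None:
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training or self.drop_prob == 0.0:
            return x
        keep_prob = 1.0 - self.drop_prob
        # One Bernoulli draw per sample, broadcast over the remaining dims.
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        mask = x.new_empty(shape).bernoulli_(keep_prob)
        return x * mask / keep_prob  # rescale to keep the expectation unchanged


dp = DropPath(0.5)
dp.eval()
x = torch.randn(4, 16, 64)
print(torch.equal(dp(x), x))  # identity in eval mode
```

In a ViT block this would wrap the attention and MLP residual branches, e.g. `x = x + drop_path(attn(norm(x)))`. Since it is a no-op in eval mode, adding it only matters for finetuning, which matches the discussion above.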
Codecov Report
Patch coverage:

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1273      +/-   ##
==========================================
- Coverage   87.37%   83.52%    -3.85%
==========================================
  Files         126      128       +2
  Lines        5275     5518     +243
==========================================
  Hits         4609     4609
- Misses        666      909     +243

☔ View full report in Codecov by Sentry. |
@guarin Hi! This is cool, let's move forward with it! What are our next steps? |
Let's do these and then merge:
Would you have time to work on it? Otherwise I can do it. |
@guarin I will do it today, thank you for the quick reply |
@guarin Done. Please take a look |
Awesome! Thanks for the super fast update. I added some small changes and the note about experimental support. Tested the code and it is running 🥳 Will merge once the tests pass. |