unify how to freeze some parameters for coca pre-training #526
Conversation
This pull request was exported from Phabricator. Differential Revision: D54559503

…search#526)

Summary:
1. We already support freezing the vision encoder; as experiments progress, we want to freeze other parts of CoCa as well, e.g., the text decoder. This diff provides a unified way of freezing/unfreezing modules, the same way we do for linear probing or finetuning.
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.
3. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, Ember's output module, and LLaVA, use bias=True (the default in nn.Linear).

Differential Revision: D54559503
Privacy Context Container: 303860477774201
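The unified freezing described in the summary can be sketched as follows. This is a minimal illustration, not the PR's actual implementation; the helper name `freeze_modules` and the toy model are assumptions, but the mechanism (disabling `requires_grad` on the parameters of named submodules) is the standard PyTorch approach used for linear probing and finetuning.

```python
import torch
from torch import nn

def freeze_modules(model: nn.Module, module_names: list[str]) -> None:
    """Freeze the named submodules by disabling gradients.

    `module_names` uses dotted paths as in `model.named_modules()`.
    """
    for name in module_names:
        module = model.get_submodule(name)  # raises AttributeError if missing
        module.eval()  # also fixes BatchNorm/Dropout behavior while frozen
        for param in module.parameters():
            param.requires_grad_(False)

# Toy CoCa-like stand-in: freeze the text decoder, keep the vision encoder trainable.
model = nn.ModuleDict({
    "vision_encoder": nn.Linear(8, 8),
    "text_decoder": nn.Linear(8, 8),
})
freeze_modules(model, ["text_decoder"])
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
# Only the vision encoder's parameters remain trainable.
```

The same helper covers the vision encoder, text decoder, or any other named submodule, which is the "unified" part: one code path instead of a per-module flag.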
Force-pushed from f4a1103 to 3953609.
Force-pushed from 3953609 to 44b179b.
Codecov Report: all modified and coverable lines are covered by tests ✅

Coverage Diff (main vs. #526):

            main     #526
Coverage    75.61%   75.62%
Files       234      234
Lines       16122    16126 (+4)
Hits        12191    12195 (+4)
Misses      3931     3931

View the full report in Codecov by Sentry.
Force-pushed from 44b179b to da89229.
Force-pushed from da89229 to 88933e9.
Force-pushed from 88933e9 to abc1037.
Force-pushed from abc1037 to dbeed97.
Summary:
Pull Request resolved: #527
Pull Request resolved: #526
1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, Ember's output module, and LLaVA, use bias=True (the default in nn.Linear).
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.

Reviewed By: Bellaktris
Differential Revision: D55897450
Privacy Context Container: 303860477774201
fbshipit-source-id: 8e012b0c3d37566364f216dbfa8aec389142afe1
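The bias change can be illustrated in isolation. This is a minimal sketch, not the PR's code; the dimensions are made up, but it shows what switching the output projection from bias=False to bias=True means and that bias=True is indeed nn.Linear's default.

```python
import torch
from torch import nn

vocab_size, hidden_dim = 100, 16

# Before: output projection without a bias term.
proj_no_bias = nn.Linear(hidden_dim, vocab_size, bias=False)
assert proj_no_bias.bias is None

# After: bias=True, which is also nn.Linear's default.
proj_with_bias = nn.Linear(hidden_dim, vocab_size)  # bias=True by default
assert proj_with_bias.bias is not None
assert tuple(proj_with_bias.bias.shape) == (vocab_size,)
```

Note that the bias adds one learnable parameter per vocabulary entry, acting as a per-token logit offset on top of the hidden-state projection.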
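The other change in the summary, a configuration option to use an MLP instead of an attention pooler for the vision adapter, can be sketched as a simple factory switch. The class and function names here (`MLPAdapter`, `build_vision_adapter`) and the mean-pooling choice are hypothetical stand-ins, not the PR's actual API.

```python
import torch
from torch import nn

class MLPAdapter(nn.Module):
    """Pool vision tokens with mean pooling followed by an MLP
    (hypothetical stand-in for the attention-pooler alternative)."""
    def __init__(self, input_dim: int, output_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, input_dim) -> (batch, output_dim)
        return self.mlp(tokens.mean(dim=1))

def build_vision_adapter(kind: str, input_dim: int, output_dim: int) -> nn.Module:
    # Hypothetical config switch between pooler types.
    if kind == "mlp":
        return MLPAdapter(input_dim, output_dim)
    if kind == "attention":
        raise NotImplementedError("attention pooler construction elided")
    raise ValueError(f"unknown adapter kind: {kind}")

adapter = build_vision_adapter("mlp", input_dim=32, output_dim=16)
out = adapter(torch.randn(2, 49, 32))  # e.g., 49 patch tokens per image
```

The MLP variant trades the learned query attention of a pooler for a fixed mean-pool, which is cheaper and has no cross-token parameters; the config switch lets the two be compared in otherwise identical training runs.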