
unify how to freeze some parameters for coca pre-training #526

Closed

Conversation

zhangtemplar
Contributor

Summary:

  1. We already support freezing the vision encoder; as experiments progress, we want to freeze other parts of CoCa as well, e.g., the text decoder. This diff provides a unified way of freezing/unfreezing modules, the same way we do for linear probing or fine-tuning.
  2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.
  3. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, Ember's output module, and LLaVA, use bias=True (the default value in Linear).

Differential Revision:
D54559503

Privacy Context Container: 303860477774201
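The unified freezing described in item 1 can be sketched in plain PyTorch: toggle `requires_grad` on all parameters of a chosen submodule. This is a minimal illustration only; the module names (`vision_encoder`, `text_decoder`) and the helper are hypothetical stand-ins, not the actual API added by this diff.

```python
import torch
from torch import nn

def set_requires_grad(module: nn.Module, requires_grad: bool) -> None:
    """Freeze or unfreeze every parameter of a module in place."""
    for p in module.parameters():
        p.requires_grad_(requires_grad)

# Toy stand-in for a CoCa-style model; module names are illustrative only.
model = nn.ModuleDict({
    "vision_encoder": nn.Linear(8, 8),
    "text_decoder": nn.Linear(8, 8),
})

# Freeze the text decoder while leaving the vision encoder trainable.
set_requires_grad(model["text_decoder"], False)

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

With this pattern, which parts of the model are frozen becomes a config choice rather than hard-coded logic, which is what makes it reusable across linear probing, fine-tuning, and pre-training.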

Mar 13, 2024: @facebook-github-bot added the label CLA Signed (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D54559503

zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 14, 2024
…search#526)


zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 14, 2024
…search#526)


@codecov-commenter

codecov-commenter commented Mar 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.62%. Comparing base (dbeed97) to head (88933e9).

@@           Coverage Diff           @@
##             main     #526   +/-   ##
=======================================
  Coverage   75.61%   75.62%           
=======================================
  Files         234      234           
  Lines       16122    16126    +4     
=======================================
+ Hits        12191    12195    +4     
  Misses       3931     3931           


zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 20, 2024
…search#526)


zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 21, 2024
…search#526)


zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Mar 29, 2024
…search#526)



zhangtemplar added a commit to zhangtemplar/multimodal that referenced this pull request Apr 8, 2024
Summary:

1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, Ember's output module, and LLaVA, use bias=True (the default value in Linear).
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.

Differential Revision:
D55897450

Privacy Context Container: 303860477774201
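Item 2 above (MLP instead of an attention pooler for the vision adapter) can be sketched as a per-token MLP that preserves the token dimension. This is a hypothetical illustration; the function name `build_vision_adapter`, the config key `pooler_type`, and the layer sizes are assumptions, not the API introduced by this diff.

```python
import torch
from torch import nn

def build_vision_adapter(pooler_type: str, dim: int = 32, hidden_dim: int = 64) -> nn.Module:
    # Hypothetical config switch: "mlp" builds a simple per-token MLP;
    # an "attention" branch would return an attention pooler instead.
    if pooler_type == "mlp":
        return nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, dim),
        )
    raise ValueError(f"unsupported pooler_type: {pooler_type}")

adapter = build_vision_adapter("mlp")
tokens = torch.randn(2, 10, 32)  # (batch, sequence, dim)
out = adapter(tokens)            # shape is preserved: (2, 10, 32)
```

Unlike an attention pooler, this MLP keeps one output per input token rather than pooling the sequence down to a fixed number of queries.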
facebook-github-bot pushed a commit that referenced this pull request Apr 25, 2024
Summary:
Pull Request resolved: #527

Pull Request resolved: #526

1. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, Ember's output module, and LLaVA, use bias=True (the default value in Linear).
2. Add a configuration option to use an MLP instead of an attention pooler for the vision adapter.

Reviewed By: Bellaktris

Differential Revision:
D55897450

Privacy Context Container: 303860477774201

fbshipit-source-id: 8e012b0c3d37566364f216dbfa8aec389142afe1
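The bias change in item 1 amounts to relying on `nn.Linear`'s default rather than opting out of it. A minimal sketch (the 512/1000 dimensions are arbitrary illustration, not the model's actual sizes):

```python
import torch
from torch import nn

# bias=True is nn.Linear's default, matching the LP head, Ember's output
# module, and LLaVA as cited in the summary above.
proj_with_bias = nn.Linear(512, 1000)             # bias defaults to True
proj_without_bias = nn.Linear(512, 1000, bias=False)

assert proj_with_bias.bias is not None and proj_with_bias.bias.shape == (1000,)
assert proj_without_bias.bias is None
```

Adding the bias introduces one extra parameter per output unit; for an output projection onto a vocabulary it lets each logit carry a learned offset independent of the input.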