Add Della merge method #366

Merged: 5 commits, Jul 20, 2024


Tej-Deep
Contributor

Adds a new merge method, della. Della first ranks the delta parameters within each row by magnitude, then assigns each one a drop probability that decreases as its magnitude increases, so larger deltas are more likely to be kept. The delta parameters are then dropped according to these probabilities and the survivors are rescaled, in a manner similar to the DARE method. The Della-merging paper is available at https://arxiv.org/abs/2406.11617.
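The steps described above can be sketched in plain Python for a single row of delta parameters. This is a minimal illustration, not the code added in this PR: the helper names, the `min_drop` and `max_drop` bounds, and the linear mapping from rank to drop probability are all assumptions made for the example.

```python
import random


def drop_probabilities(row, min_drop=0.1, max_drop=0.5):
    """Hypothetical helper: per-element drop probabilities for one row of deltas.

    Assumes a linear map from magnitude rank to probability; the largest
    magnitude gets min_drop and the smallest gets max_drop.
    """
    if not 0.0 <= min_drop <= max_drop < 1.0:
        raise ValueError("need 0 <= min_drop <= max_drop < 1")
    n = len(row)
    # Indices ordered by magnitude, smallest first.
    order = sorted(range(n), key=lambda i: abs(row[i]))
    probs = [0.0] * n
    for rank, i in enumerate(order):
        # Normalised rank in [0, 1]; 1.0 marks the largest magnitude.
        frac = rank / (n - 1) if n > 1 else 1.0
        probs[i] = max_drop - frac * (max_drop - min_drop)
    return probs


def drop_and_rescale(row, probs, rng=random):
    """Hypothetical helper: drop each delta with its own probability p and
    rescale survivors by 1 / (1 - p), as in DARE-style rescaling."""
    out = []
    for value, p in zip(row, probs):
        if rng.random() < p:
            out.append(0.0)  # dropped
        else:
            out.append(value / (1.0 - p))  # kept and rescaled
    return out


row = [0.1, -0.5, 0.3, 2.0]
probs = drop_probabilities(row)
sparse_row = drop_and_rescale(row, probs, rng=random.Random(0))
print(probs)
print(sparse_row)
```

Dividing each survivor by its own keep probability keeps the expected value of every delta unchanged, which is the property the DARE-style rescaling is meant to preserve.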

@cg123
Collaborator

cg123 commented Jul 16, 2024

Thanks for the PR! Glad to get your method into mergekit.

It would be great to also add Della to the README with a link to your paper. Happy to do this myself if you'd like.

@Tej-Deep
Contributor Author

Thank you. I have added Della to the README. Please feel free to make any changes.

@cg123
Collaborator

cg123 commented Jul 20, 2024

Looks great! Thanks for the PR.

@cg123 cg123 merged commit 619f4e4 into arcee-ai:main Jul 20, 2024
4 checks passed
@Se-Hun

Se-Hun commented Jul 29, 2024

@Tej-Deep

Hello, thanks for sharing your approach and code.
I read your paper (https://arxiv.org/abs/2406.11617) and want to apply it to my model.
How should I configure the YAML file for mergekit?
Could you share an example?

@Tej-Deep
Contributor Author

@Se-Hun

Hello. The YAML config for Della is similar to the ones for Ties and Dare. Here is an example config file:

models:
  - model: MODEL_1_PATH
    parameters:
      weight: 1.0
  - model: MODEL_2_PATH
    parameters:
      weight: 1.0
merge_method: della
base_model: BASE_MODEL_PATH
parameters:
  density: 0.7
  lambda: 1.1
  epsilon: 0.2
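As a rough sanity check on values like these, the sketch below computes the probability range implied by density and epsilon. It rests on an assumption about how the parameters are read, not on anything stated in this thread: density is taken as the average fraction of delta parameters kept, epsilon as the half-width of the range of per-parameter probabilities around it, and lambda as a scale applied to the merged deltas. The function name is hypothetical.

```python
def keep_probability_range(params):
    """Hypothetical helper: range of keep probabilities implied by a config.

    Assumes probabilities span density - epsilon to density + epsilon,
    so both ends must stay inside (0, 1].
    """
    density = params["density"]
    epsilon = params["epsilon"]
    low, high = density - epsilon, density + epsilon
    if not (0.0 < low and high <= 1.0):
        raise ValueError("density +/- epsilon must stay within (0, 1]")
    return low, high


# Values copied from the example config above.
params = {"density": 0.7, "lambda": 1.1, "epsilon": 0.2}
low, high = keep_probability_range(params)
print(f"keep probability ranges from {low:.2f} to {high:.2f}")
```

Under this reading, a pair such as density 0.9 with epsilon 0.2 would be rejected, because the upper end of the range would exceed 1.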

@Se-Hun

Se-Hun commented Jul 29, 2024

Oh, thank you for the quick answer.
