[feat] Improve the weight conversion interface #99

Open
jlamypoirier opened this issue Dec 21, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@jlamypoirier
Collaborator

🧐 Problem Description

There are minor issues with the weight conversion. Mainly:

  • WeightConverter is not consistent with ParamConverter: one uses fast_llm_names with an enforced tuple format, while the other uses fast_llm_name, which accepts both tuple and str formats.
  • The external converter is hard-coded to Safetensors. We probably want to support other file formats.
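
To make the first point concrete, here is a minimal sketch of the two naming styles side by side. The classes `SingularNameConverter` and `PluralNamesConverter` and the helper `names_of` are hypothetical stand-ins, not the actual Fast-LLM definitions; only the field names `fast_llm_name` and `fast_llm_names` come from this issue, and the real fields may be shaped differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SingularNameConverter:
    # Hypothetical stand-in for the loose style: a str or a tuple is accepted.
    fast_llm_name: str | tuple[str, ...]


@dataclass
class PluralNamesConverter:
    # Hypothetical stand-in for the strict style: only a tuple is accepted.
    fast_llm_names: tuple[str, ...]

    def __post_init__(self) -> None:
        if not isinstance(self.fast_llm_names, tuple):
            raise TypeError(
                f"fast_llm_names must be a tuple, got {type(self.fast_llm_names).__name__}"
            )


def names_of(converter: SingularNameConverter) -> tuple[str, ...]:
    # With the loose style, every caller has to normalize before iterating:
    # looping over a bare str would yield single characters, not names.
    name = converter.fast_llm_name
    return (name,) if isinstance(name, str) else tuple(name)
```

With the strict style, the normalization step disappears from calling code, and a bare string is rejected at construction time instead of being silently iterated character by character.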

See also #98

💡 Proposed Solution

  • Make variable names plural in WeightConverter.
  • Enforce tuple format
  • Generalize StateDictCheckpointHandler so it's not hard-coded to Safetensors
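
One possible shape for the last item is to move file I/O behind a small format interface, so that the handler no longer needs to know which file format it is writing. The sketch below is an assumption about how this could look, not the existing Fast-LLM code: `StateDictFormat`, `JsonFormat`, `PickleFormat`, `GenericStateDictHandler` and `roundtrip` are all hypothetical names, and JSON and pickle are used only as standard-library stand-ins for real tensor formats.

```python
from __future__ import annotations

import abc
import json
import pathlib
import pickle
import tempfile
from typing import Any


class StateDictFormat(abc.ABC):
    """Hypothetical interface: how one state dict is written to and read from a file."""

    extension: str

    @abc.abstractmethod
    def save(self, state_dict: dict[str, Any], path: pathlib.Path) -> None: ...

    @abc.abstractmethod
    def load(self, path: pathlib.Path) -> dict[str, Any]: ...


class JsonFormat(StateDictFormat):
    extension = ".json"

    def save(self, state_dict: dict[str, Any], path: pathlib.Path) -> None:
        path.write_text(json.dumps(state_dict))

    def load(self, path: pathlib.Path) -> dict[str, Any]:
        return json.loads(path.read_text())


class PickleFormat(StateDictFormat):
    extension = ".pkl"

    def save(self, state_dict: dict[str, Any], path: pathlib.Path) -> None:
        path.write_bytes(pickle.dumps(state_dict))

    def load(self, path: pathlib.Path) -> dict[str, Any]:
        return pickle.loads(path.read_bytes())


class GenericStateDictHandler:
    """Hypothetical handler that delegates all file I/O to a pluggable format."""

    def __init__(self, file_format: StateDictFormat) -> None:
        self._format = file_format

    def save(self, state_dict: dict[str, Any], directory: pathlib.Path, name: str = "model") -> pathlib.Path:
        # The file extension comes from the format, not from the handler.
        path = directory / f"{name}{self._format.extension}"
        self._format.save(state_dict, path)
        return path

    def load(self, path: pathlib.Path) -> dict[str, Any]:
        return self._format.load(path)


def roundtrip(file_format: StateDictFormat, state_dict: dict[str, Any]) -> dict[str, Any]:
    # Save to a temporary directory and read the file back with the same format.
    with tempfile.TemporaryDirectory() as directory:
        handler = GenericStateDictHandler(file_format)
        path = handler.save(state_dict, pathlib.Path(directory))
        return handler.load(path)
```

Under this design, Safetensors would become one more `StateDictFormat` subclass, for instance wrapping `safetensors.torch.save_file` and `safetensors.torch.load_file`, rather than being built into the handler itself.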

🔄 Alternatives Considered

Things work right now so we don't have to do anything, but it's a good idea to be proactive.

📈 Potential Benefits

Improved consistency and generalizability.

jlamypoirier added the enhancement label on Dec 21, 2024