HF Integration #248

Open · wants to merge 1 commit into main
Conversation

sedrick-keh-tri (Collaborator)
This follows what OLMo does with their HF integration.

It allows us to work with HF without having to create new classes in the upstream transformers repo. Since the integration reads directly from this repo, we also don't need to worry about the OpenLM codebase being updated in the future; changes are picked up automatically.

Usage is exactly the same as standard HF usage, except for the additional import on the first line:

# Importing open_lm_hf registers the OpenLM architecture with the HF Auto classes.
from open_lm_hf import *

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tri-ml/openlm-7b-300b")
model = AutoModelForCausalLM.from_pretrained("tri-ml/openlm-7b-300b")

inputs = tokenizer("Hi, nice to meet you.", return_tensors="pt")
out = model.to("cuda").generate(inputs["input_ids"].to("cuda"))
print(tokenizer.decode(out[0]))
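For context, the wildcard import presumably works by registering the OpenLM architecture with the HF Auto classes at import time, in the style of OLMo's integration. A hedged sketch of what open_lm_hf might do on import (the module and class names below are assumptions, not the actual names in this repo):

from transformers import AutoConfig, AutoModelForCausalLM

# Hypothetical module/class names; the actual ones in open_lm_hf may differ.
from .configuration_openlm import OpenLMConfig
from .modeling_openlm import OpenLMForCausalLM

# Registering under the config's model_type lets AutoModelForCausalLM
# resolve checkpoints whose config.json declares this architecture.
AutoConfig.register("openlm", OpenLMConfig)
AutoModelForCausalLM.register(OpenLMConfig, OpenLMForCausalLM)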

Some things are not implemented yet:

  • Some extra HF functions, such as resize_token_embeddings.
  • The HF forward output CausalLMOutputWithPast usually includes the full hidden states, but OpenLM's forward doesn't return them, so hidden_states is left as None for now.
  • There's also an if labels is not None: chunk that I just copied from OLMo and didn't test (see the sketch after this list).
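For concreteness, a minimal sketch of how the wrapper's forward might assemble the HF output, including the untested labels chunk; the OpenLM call signature and return tuple shown here are assumptions, not the actual code in this PR:

import torch.nn.functional as F
from transformers.modeling_outputs import CausalLMOutputWithPast

def forward(self, input_ids, labels=None, past_key_values=None, **kwargs):
    # Assumed OpenLM return convention: (logits, _, past_key_values).
    logits, _, past_key_values = self.model(input_ids, past_key_values=past_key_values)

    loss = None
    if labels is not None:
        # The chunk copied from OLMo: standard causal-LM shift,
        # predicting token t+1 from positions <= t.
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()
        loss = F.cross_entropy(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1),
        )

    return CausalLMOutputWithPast(
        loss=loss,
        logits=logits,
        past_key_values=past_key_values,
        hidden_states=None,  # OpenLM's forward doesn't return full hidden states
    )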

@sedrick-keh-tri (Collaborator, Author)

The HF model repo's config.json should look something like this:

{
  "dim": 4096,
  "n_layers": 32, 
  "n_heads": 32, 
  "vocab_size": 50432,
  "norm_eps": 1e-5,
  "seq_len": 2048,
  "weight_tying": false,
  "apply_qk_norm": true,
  "norm_type": "gain_only_lp_layer_norm",
  "positional_embedding_type": "rotary",
  "ffn_type": "swiglu"
}

These keys correspond to attributes of OpenLM's Params class.
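As a sanity check, a hedged sketch of loading such a config into a Params-like object; the dataclass below mirrors only the keys shown above, and the real Params class likely has more fields and different defaults:

import json
from dataclasses import dataclass

# Hypothetical mirror of OpenLM's Params; only the keys from the
# config.json above are included here.
@dataclass
class Params:
    dim: int
    n_layers: int
    n_heads: int
    vocab_size: int
    norm_eps: float
    seq_len: int
    weight_tying: bool
    apply_qk_norm: bool
    norm_type: str
    positional_embedding_type: str
    ffn_type: str

with open("config.json") as f:
    params = Params(**json.load(f))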
