
Add InternLM2 model and tests #29667

Closed · wants to merge 6 commits

Conversation

x54-729

@x54-729 x54-729 commented Mar 15, 2024

What does this PR do?

  • Add tokenizer, fast tokenizer, configuration and model of InternLM2
  • Add tests for the InternLM2 tokenizer and model
  • Complete the README and model doc of InternLM2 (the InternLM2 technical report will be released next week, and I will update the paper link in this PR)

All the tests have passed locally except for some tokenizer issues (#29617, #29626). These failing tests are skipped temporarily.

I'm not entirely sure that my code format and tests are proper enough to merge; if any changes are needed, please let me know and I will fix them asap!

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker

@vansin
Contributor

vansin commented Mar 15, 2024

amazing!!!

Collaborator

@ArthurZucker ArthurZucker left a comment


Hey! Thanks for porting this model. Is the only difference from the Llama arch the non-split QKV? If so, there is the Persimmon architecture, which should be compatible out of the box with weight renaming!
What about the tokenizer? Pretty sure the first InternLM used a LlamaTokenizer converted by @Rocketknight1, no?

@x54-729
Author

x54-729 commented Mar 27, 2024

@ArthurZucker

Thanks for your reply!

  1. Yes, the QKV of InternLM2 are merged. According to section 2.2 (Model Structure) of our tech report (https://arxiv.org/pdf/2403.17297.pdf), this arrangement of merged QKV can accelerate inference. The merged QKV is also convenient for tensor parallelism.

  2. Our tokenizer is different from Llama's since our add_dummy_prefix is False, so using the Llama tokenizer for InternLM previously caused some problems. Additionally, the InternLM2 chat model uses some special tokens, so I have to use InternLM2Converter instead of LlamaConverter.
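For reference, splitting a fused QKV weight back into separate projections can be sketched as below. The sizes and the per-KV-group interleaved layout are assumptions for illustration, not the exact InternLM2 checkpoint format:

```python
import torch

# Hypothetical sizes for illustration (not the real InternLM2 config).
hidden_size = 64
num_heads = 8          # query heads
num_kv_heads = 2       # grouped-query attention: fewer K/V heads
head_dim = hidden_size // num_heads
q_per_group = num_heads // num_kv_heads  # query heads per K/V group

# An assumed fused layout: for each KV group, the rows hold q_per_group
# query heads, then one key head, then one value head.
rows_per_group = (q_per_group + 2) * head_dim
wqkv = torch.randn(num_kv_heads * rows_per_group, hidden_size)

# Split the fused weight back into separate q/k/v projection matrices.
grouped = wqkv.view(num_kv_heads, q_per_group + 2, head_dim, hidden_size)
wq = grouped[:, :q_per_group].reshape(-1, hidden_size)
wk = grouped[:, -2].reshape(-1, hidden_size)
wv = grouped[:, -1].reshape(-1, hidden_size)

assert wq.shape == (num_heads * head_dim, hidden_size)
assert wk.shape == (num_kv_heads * head_dim, hidden_size)
assert wv.shape == (num_kv_heads * head_dim, hidden_size)
```

Keeping the three projections in one matrix means a single GEMM per attention layer instead of three, which is where the inference speedup mentioned above comes from.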

@ArthurZucker
Collaborator

ArthurZucker commented Mar 27, 2024

Regarding 1., there is a Llama-like model with fused QKV in transformers: Persimmon! If it's not an exact match, I recommend starting from that arch with Copied from!
2. If that is the only difference, GemmaTokenizer is the closest. You probably don't even need that and can just use AutoTokenizer with a PreTrainedTokenizerFast, relying only on the tokenizer.json. Something similar to the Gemma tokenizer should do the trick? I can probably help you with that if you want? 🤗
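The tokenizer.json-only route can be sketched as below. The toy vocab is invented purely for illustration; a real model repo would ship the converted InternLM2 tokenizer.json instead:

```python
from tokenizers import Tokenizer, models, pre_tokenizers
from transformers import PreTrainedTokenizerFast

# Build a throwaway tokenizer.json as a stand-in for the real converted file.
vocab = {"<unk>": 0, "hello": 1, "world": 2}
toy = Tokenizer(models.WordLevel(vocab, unk_token="<unk>"))
toy.pre_tokenizer = pre_tokenizers.Whitespace()
toy.save("tokenizer.json")

# With a tokenizer.json shipped in the model repo, no custom tokenizer class
# is needed: PreTrainedTokenizerFast loads the file directly (AutoTokenizer
# does the same when the repo is mapped to PreTrainedTokenizerFast).
fast = PreTrainedTokenizerFast(tokenizer_file="tokenizer.json", unk_token="<unk>")
print(fast.tokenize("hello world"))  # -> ['hello', 'world']
```

Settings like add_dummy_prefix live inside the serialized tokenizer.json, which is why this route sidesteps the Llama-tokenizer mismatch described above.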

@x54-729
Author

x54-729 commented Mar 28, 2024

So what you recommended was:

  1. Use the Copied from mechanism to write our model based on Persimmon if InternLM2's architecture is not exactly the same as Persimmon's.
  2. Upload a tokenizer.json file to our model repositories to deal with add_dummy_prefix and the special chat tokens, so there is no need to submit a new tokenizer in this PR for InternLM2.

Is that right?
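For context, the Copied from mechanism discussed here is a comment marker that transformers' repo-consistency CI enforces: the marked code must stay identical to the source class, modulo the rename. A minimal sketch (the class body below is invented for illustration, not the real PersimmonMLP):

```python
import torch
from torch import nn

# Copied from transformers.models.persimmon.modeling_persimmon.PersimmonMLP with Persimmon->InternLM2
class InternLM2MLP(nn.Module):
    # Invented body for illustration; in the real repo, `make repo-consistency`
    # verifies the marked code matches the source class after the rename.
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.up_proj = nn.Linear(hidden_size, intermediate_size)
        self.down_proj = nn.Linear(intermediate_size, hidden_size)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(self.act(self.up_proj(x)))

out = InternLM2MLP(8, 16)(torch.randn(2, 8))
```

The marker is what lets maintainers fix a bug once in the source model and have CI flag every copy that needs the same fix.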

@ArthurZucker
Collaborator

ArthurZucker commented Mar 28, 2024

Yes!

("falcon", (None, "PreTrainedTokenizerFast" if is_tokenizers_available() else None)),

falcon only relies on a tokenizer.json! If this does not work, then let's add InternLM2TokenizerFast, using Copied from as well.

For 1., I can have another look to check which model is the closest to help; pretty sure it's Persimmon (merged QKV) and Phi (split QKV).

@x54-729
Author

x54-729 commented Mar 29, 2024

Thanks! I will try tokenizer.json later!

As for the Persimmon model, it does not support GQA, and its QKV arrangement seems to be different from InternLM2's. In addition, the state dict names are not the same. So writing InternLM2 based on LLaMA seems more convenient? I'm not entirely sure about this.

@ArthurZucker
Collaborator

Gemma or Phi should be the closest! @SunMarc will do the next round of review! Ping him if you need any help in the meantime! 🤗

@SunMarc
Member

SunMarc commented Apr 17, 2024

Hi @x54-729, just checking in to see if you are still planning to finish this PR. Feel free to ask me any questions =)

@huggingface huggingface deleted a comment from github-actions bot May 13, 2024

github-actions bot commented Jun 7, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Jun 15, 2024
@SunMarc SunMarc reopened this Jun 17, 2024
@github-actions github-actions bot closed this Jun 26, 2024
@SunMarc SunMarc reopened this Jun 26, 2024
@github-actions github-actions bot closed this Jul 5, 2024