
Comparison of capabilities of LLMs with the same parameter scale. #1

Open
jianyuheng opened this issue Jun 5, 2024 · 3 comments
@jianyuheng

For example, comparing the capabilities of the 7B model and the 13b-lowrank model.

@mobicham
Collaborator

mobicham commented Jun 5, 2024

Hi, can you please clarify what you mean? What is "parameter scale" and what is the issue exactly?

@jianyuheng
Author

I mean that 13b-lowrank is equivalent to roughly a 6.5B model. Have you compared perplexity (ppl) between 13b-lowrank and 7b?
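(For context on the comparison being asked about: perplexity is the exponential of the mean per-token negative log-likelihood over an evaluation corpus. A minimal sketch of that formula, independent of any particular model or evaluation harness, assuming the per-token NLLs have already been collected:)

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp of the mean per-token negative log-likelihood.

    token_nlls: a list of per-token NLL values (in nats) collected from
    running a model over an evaluation corpus (hypothetical inputs here).
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# Sanity check: if every token has NLL = ln(2), perplexity is exactly 2,
# i.e. the model is as uncertain as a fair coin flip per token.
print(perplexity([math.log(2)] * 4))  # → 2.0
```

Comparing two models on the same corpus this way (lower is better) is the usual apples-to-apples PPL comparison the question refers to.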

@mobicham
Collaborator

mobicham commented Jun 6, 2024

Ah, I understand. No, we didn't. Our goal at the time was to get a 3B version of the Llama2-7B chat model. The Phi-2 base model was released about 3-5 months after this low-rank work. We trained our own instruct version instead, which is better than Llama2-7B chat at half the size: https://huggingface.co/mobiuslabsgmbh/aanaphi2-v0.1 — and that achieved our goal instead.
