
Comparison of capabilities of LLMs with the same parameter scale. #1

Open
jianyuheng opened this issue Jun 5, 2024 · 3 comments
@jianyuheng

For example, comparing the capabilities of the 7B model and the 13b-lowrank model.

@mobicham
Collaborator

mobicham commented Jun 5, 2024

Hi, can you please clarify what you mean? What is "parameter scale" and what is the issue exactly?

@jianyuheng
Author

I mean that 13b-lowrank is equivalent to roughly a 6.5B model. Have you compared perplexity (ppl) between 13b-lowrank and 7b?
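(For context on the comparison being asked about: perplexity is the exponential of the mean per-token negative log-likelihood over an evaluation corpus. A minimal sketch of that formula, independent of any particular model or evaluation harness, assuming the per-token NLLs have already been collected:)

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp of the mean per-token negative log-likelihood.

    token_nlls: a list of per-token NLL values (in nats) collected from
    running a model over an evaluation corpus (hypothetical inputs here).
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# Sanity check: if every token has NLL = ln(2), perplexity is exactly 2,
# i.e. the model is as uncertain as a fair coin flip per token.
print(perplexity([math.log(2)] * 4))  # → 2.0
```

Comparing two models on the same corpus this way (lower is better) is the usual apples-to-apples PPL comparison the question refers to.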

@mobicham
Collaborator

mobicham commented Jun 6, 2024

Ah, I understand. No, we didn't. Our goal at the time was to get a 3B version of the Llama2-7B chat model. The Phi-2 base model was released about 3-5 months after this low-rank work. We trained our own instruct version instead, which is better than Llama2-7B chat at half the size: https://huggingface.co/mobiuslabsgmbh/aanaphi2-v0.1 — and that achieved our goal instead.
