Results for 14 pretrained models coming soon #16
alasdairforsythe
announced in
Announcements
Replies: 3 comments 1 reply
-
Any news about this experiment? 👀
-
Hooray!!!
-
Any follow-ups on this, @alasdairforsythe? For 80% one-word tokens and 20% multi-word tokens?
-
I've been pretraining NanoGPT Small and Medium models from scratch with different TokenMonster vocabularies and comparing them against the GPT-2 Tokenizer and TikToken p50k_base.
Results will be up in 1-2 weeks.