I'm asking here because GPT-3.5, as far as I remember, was trained on a much larger dataset and can generate much longer text. I agree it is old now, but I was still very satisfied with its longer responses and detailed explanations compared to today's smaller 8B models.
So I wonder whether these benchmarks are legitimate. I have used countless small LLMs whose reported benchmark scores are on par with GPT-4, yet they sometimes fail at even basic tasks. It seems obvious that a model trained on so small a dataset, with only a few billion parameters, cannot satisfy users the way bigger models like Llama, GPT, and Claude can. The only plausible way to reach such scores today is to train specifically for the benchmarks, which in my opinion is not healthy for open source development, though I have to admit this unfortunately seems to be the case for most of them.