@CharlieFRuan From the docs, it can be seen that mlc-llm supports q3f16 quantization. However, all the models used in web-llm are either q4 or q0. Is q3f16 quantization not supported in web-llm (only in mlc-llm)? I am asking because, if q3 is indeed supported in web-llm, it should be used, since mobile browsers are generally memory-constrained.
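
For context, here is a minimal sketch of how one might try loading a self-compiled q3f16 model in web-llm via a custom `appConfig`, assuming the current `ModelRecord` fields (`model`, `model_id`, `model_lib`). The model URLs and IDs are hypothetical placeholders; you would first need to compile and host the q3f16 artifacts yourself with mlc-llm:

```typescript
// Hypothetical sketch — the model URL, model_id, and model_lib below are
// placeholders. You would first compile the weights with mlc-llm, e.g.
//   mlc_llm convert_weight <model> --quantization q3f16_1 -o <out-dir>
// and host the resulting artifacts somewhere reachable from the browser.
import { CreateMLCEngine, prebuiltAppConfig } from "@mlc-ai/web-llm";

async function main() {
  const appConfig = {
    ...prebuiltAppConfig,
    model_list: [
      ...prebuiltAppConfig.model_list,
      {
        // Placeholder URLs/IDs — replace with your own hosted artifacts.
        model: "https://huggingface.co/my-org/Llama-3-8B-q3f16_1-MLC",
        model_id: "Llama-3-8B-q3f16_1-MLC",
        model_lib: "https://my-host.example/Llama-3-8B-q3f16_1-webgpu.wasm",
      },
    ],
  };

  // Load the custom q3f16 model the same way as a prebuilt one.
  const engine = await CreateMLCEngine("Llama-3-8B-q3f16_1-MLC", { appConfig });
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```

If loading fails at the wasm or weight-shard stage, that would suggest the q3f16 kernels are not built for the WebGPU target, which is essentially what this question is asking.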