Multiple ways to change gemmPrecision #91

Open
jerinphilip opened this issue Apr 14, 2022 · 0 comments


--gemm-precision and --int* are two ways to do the same thing. The functionality would still work and remain accessible through --gemm-precision without the --int* flags defined below.

cli.add<bool>("--int8",
"Optimize speed even more aggressively sacrificing memory or precision by using 8bit integer GEMM with intgemm instead of floats. Only available on CPU. Corresponds to --gemm-precision int8");
cli.add<bool>("--int8Alpha",
"Use a precomputed quantisation multipliers for the activations. Requires a special model. Corresponds to --gemm-precision int8Alpha");
cli.add<bool>("--int8shift",
"Use a faster, shifted integer 8bit GEMM implementation. Corresponds to --gemm-precision int8shift");
cli.add<bool>("--int8shiftAlpha",
"Use a faster, shifted integer 8bit GEMM implementation, with precomputed alphas. Corresponds to --gemm-precision int8shiftAlpha");
cli.add<bool>("--int8shiftAll",
"Use a faster, shifted integer 8bit GEMM implementation even for matrices that don't have a bias. Beneficial on VNNI. Corresponds to --gemm-precision int8shiftAll");
cli.add<bool>("--int8shiftAlphaAll",
"Use a faster, shifted integer 8bit GEMM implementation even for matrices that don't have a bias, with precomputed alphas. Should be the fastest option. Corresponds to --gemm-precision int8shiftAlphaAll");
cli.add<std::string>("--gemm-precision",
"Use lower precision for the GEMM operations only. Supported values: float32, int16, int8, int8Alpha, int8shift, int8shiftAlpha, int8shiftAll, int8shiftAlphaAll", "float32");
cli.add<bool>("--dump-quantmult",
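
To illustrate the redundancy, here is a minimal sketch (not code from the repository) of how each boolean flag collapses onto a --gemm-precision value. The helper name gemmPrecisionFromLegacyFlag and the idea of treating the old flags as aliases are assumptions made for illustration; only the list of values comes from the help strings above. It requires C++17.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the project: returns the --gemm-precision
// value equivalent to a legacy boolean flag such as "--int8shift", or an
// empty optional if the argument is not one of those flags.
inline std::optional<std::string> gemmPrecisionFromLegacyFlag(std::string_view flag) {
  // Values copied from the "Corresponds to --gemm-precision ..." help strings.
  static constexpr std::array<std::string_view, 6> kLegacyFlags = {
      "int8",         "int8Alpha",    "int8shift",
      "int8shiftAlpha", "int8shiftAll", "int8shiftAlphaAll"};

  // Only arguments written as long options ("--name") are considered.
  if (flag.substr(0, 2) != "--") return std::nullopt;
  flag.remove_prefix(2);

  for (std::string_view name : kLegacyFlags) {
    if (flag == name) return std::string(name);
  }
  return std::nullopt;
}

// Small self-check showing the intended behaviour.
inline void checkLegacyFlagMapping() {
  assert(gemmPrecisionFromLegacyFlag("--int8shift") == "int8shift");
  assert(!gemmPrecisionFromLegacyFlag("--gemm-precision").has_value());
}
```

With a mapping like this, a parser could rewrite any of the old flags into the matching --gemm-precision value before option handling, so dropping the separate cli.add<bool> entries would not remove any capability.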
