# TinyNN Quantization Support

## Unsupported operators in PyTorch for static quantization

The quantized ops below are not natively supported by PyTorch (and possibly not by TFLite either). Some of them can still be translated to quantized TFLite with extra configuration; see the flags in the next section.

| Operator | Minimum Supported PyTorch Version |
|----------|-----------------------------------|
| `abs` | / |
| `atan` | / |
| `atan2` | / |
| `bmm` | / |
| `clamp_max` | / |
| `clamp_min` | / |
| `cos` | / |
| `elu` | / |
| `exp` | / |
| `glu` | / |
| `group_norm` | / |
| `hardsigmoid` | / |
| `instance_norm` | / |
| `layer_norm` | / |
| `log` | / |
| `log_softmax` | / |
| `matmul` | / |
| `mm` | / |
| `norm` | / |
| `pad` | 1.7.0 |
| `pow` | / |
| `prelu` | / |
| `reciprocal` | / |
| `silu` | / |
| `sin` | / |
| `softmax` | / |
| `sqrt` | / |
| `std` | / |
| `sum` | / |
| `torch.nn.ConstantPad1d` | 1.7.0 |
| `torch.nn.ConstantPad2d` | 1.7.0 |
| `torch.nn.ConstantPad3d` | 1.7.0 |
| `torch.nn.ConvTranspose2d` | 1.7.0 |
| `torch.nn.GLU` | / |
| `torch.nn.GRU` | 1.13.0 |
| `torch.nn.GroupNorm` | / |
| `torch.nn.Hardsigmoid` | / |
| `torch.nn.InstanceNorm1d` | / |
| `torch.nn.InstanceNorm2d` | / |
| `torch.nn.LSTM` | 1.13.0 |
| `torch.nn.LayerNorm` | / |
| `torch.nn.LogSoftmax` | / |
| `torch.nn.PReLU` | / |
| `torch.nn.RMSNorm` | / |
| `torch.nn.RNN` | / |
| `torch.nn.SiLU` | / |
| `torch.nn.Softmax` | / |
| `torch.nn.ZeroPad2d` | 1.7.0 |
| `truediv` | / |
| `var` | / |

## Extra flags for translating the above ops to quantized TFLite

| Operators | Notes |
|-----------|-------|
| `abs` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `bmm` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `clamp_max` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `clamp_min` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `elu` | No action needed |
| `glu` | No action needed |
| `log_softmax` | For `QATQuantizer`/`PostQuantizer`, set `config={"set_quantizable_op_stats": True}`<br>For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `matmul` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `prelu` | No action needed |
| `silu` | No action needed |
| `softmax` | For `QATQuantizer`/`PostQuantizer`, set `config={"set_quantizable_op_stats": True}`<br>For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `sum` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `torch.nn.GLU` | No action needed |
| `torch.nn.LogSoftmax` | For `QATQuantizer`/`PostQuantizer`, set `config={"set_quantizable_op_stats": True}`<br>For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `torch.nn.PReLU` | No action needed |
| `torch.nn.SiLU` | No action needed |
| `torch.nn.Softmax` | For `QATQuantizer`/`PostQuantizer`, set `config={"set_quantizable_op_stats": True}`<br>For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `truediv` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
| `{sqrt, reciprocal}` | For `TFLiteConverter`, set `rewrite_quantizable=True` |
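
As an illustration, the two flags can be combined in one workflow: the quantizer collects statistics for ops such as `Softmax`, and the converter rewrites them into quantized TFLite ops. The sketch below follows the usual `QATQuantizer`/`TFLiteConverter` usage; the toy model, file paths, and the omitted training step are placeholders, and constructor arguments may vary between TinyNN versions.

```python
import torch
import torch.nn as nn

from tinynn.graph.quantization.quantizer import QATQuantizer
from tinynn.converter import TFLiteConverter


# Toy model containing Softmax, one of the ops listed above (illustrative only).
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 8)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        return self.softmax(self.fc(x))


model = TinyModel()
dummy_input = torch.randn(1, 16)

# Record observer statistics for quantizable-but-unsupported ops such as Softmax
# (the "set_quantizable_op_stats" flag from the table above).
quantizer = QATQuantizer(
    model, dummy_input, work_dir='out', config={'set_quantizable_op_stats': True}
)
qat_model = quantizer.quantize()

# ... run QAT training (or calibration when using PostQuantizer) here ...

with torch.no_grad():
    qat_model.eval()
    qat_model.cpu()

    # Convert the fake-quantized graph into an actual quantized model.
    qat_model = quantizer.convert(qat_model)
    torch.backends.quantized.engine = quantizer.backend

    # Rewrite the ops above into their quantized TFLite counterparts
    # (the "rewrite_quantizable" flag from the table above).
    converter = TFLiteConverter(
        qat_model, dummy_input, tflite_path='out/model.tflite', rewrite_quantizable=True
    )
    converter.convert()
```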

## Supported fusion rules for static quantization

| Operators | Notes |
|-----------|-------|
| `{add, clamp}` | |
| `{add, relu6}` | |
| `{add, torch.nn.ReLU6}` | |
| `{torch.nn.BatchNorm2d, clamp}` | |
| `{torch.nn.BatchNorm2d, torch.nn.Conv2d}` | PTQ only. |
| `{torch.nn.BatchNorm2d, torch.nn.Conv2d, torch.nn.ReLU}` | PTQ only. |
| `{torch.nn.BatchNorm2d, torch.nn.ReLU}` | |
| `{torch.nn.BatchNorm2d, torch.nn.ReLU6}` | |
| `{torch.nn.BatchNorm3d, torch.nn.ReLU}` | |
| `{torch.nn.BatchNorm3d, torch.nn.ReLU6}` | |
| `{torch.nn.Conv1d, torch.nn.BatchNorm1d}` | |
| `{torch.nn.Conv1d, torch.nn.BatchNorm1d, torch.nn.ReLU}` | |
| `{torch.nn.Conv1d, torch.nn.BatchNorm1d, torch.nn.ReLU6}` | |
| `{torch.nn.Conv1d, torch.nn.ReLU}` | |
| `{torch.nn.Conv1d, torch.nn.ReLU6}` | |
| `{torch.nn.Conv2d, clamp}` | |
| `{torch.nn.Conv2d, torch.nn.BatchNorm2d}` | |
| `{torch.nn.Conv2d, torch.nn.BatchNorm2d, clamp}` | |
| `{torch.nn.Conv2d, torch.nn.BatchNorm2d, torch.nn.ReLU}` | |
| `{torch.nn.Conv2d, torch.nn.BatchNorm2d, torch.nn.ReLU6}` | |
| `{torch.nn.Conv2d, torch.nn.ReLU}` | |
| `{torch.nn.Conv2d, torch.nn.ReLU6}` | |
| `{torch.nn.Conv3d, torch.nn.BatchNorm3d}` | |
| `{torch.nn.Conv3d, torch.nn.BatchNorm3d, torch.nn.ReLU}` | |
| `{torch.nn.Conv3d, torch.nn.BatchNorm3d, torch.nn.ReLU6}` | |
| `{torch.nn.Conv3d, torch.nn.ReLU}` | |
| `{torch.nn.Conv3d, torch.nn.ReLU6}` | |
| `{torch.nn.ConvTranspose1d, torch.nn.BatchNorm1d}` | PTQ only. Only PyTorch 1.11.0+ is supported. |
| `{torch.nn.ConvTranspose2d, clamp}` | |
| `{torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d}` | |
| `{torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d, clamp}` | |
| `{torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d, torch.nn.ReLU}` | |
| `{torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d, torch.nn.ReLU6}` | |
| `{torch.nn.ConvTranspose2d, torch.nn.ReLU}` | |
| `{torch.nn.ConvTranspose2d, torch.nn.ReLU6}` | |
| `{torch.nn.ConvTranspose3d, torch.nn.BatchNorm3d}` | PTQ only. Only PyTorch 1.11.0+ is supported. |
| `{torch.nn.Linear, clamp}` | |
| `{torch.nn.Linear, torch.nn.BatchNorm1d}` | For PTQ, only PyTorch 1.8.0+ is supported. |
| `{torch.nn.Linear, torch.nn.BatchNorm1d, clamp}` | |
| `{torch.nn.Linear, torch.nn.BatchNorm1d, torch.nn.ReLU6}` | |
| `{torch.nn.Linear, torch.nn.ReLU}` | |
| `{torch.nn.Linear, torch.nn.ReLU6}` | |
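
For example, a `Conv2d -> BatchNorm2d -> ReLU` sequence matches the `{torch.nn.Conv2d, torch.nn.BatchNorm2d, torch.nn.ReLU}` rule above and can therefore be folded into a single fused, quantized convolution. The block below is a minimal illustrative sketch; the module and attribute names are arbitrary.

```python
import torch.nn as nn


class ConvBNReLU(nn.Module):
    """A block matching the {Conv2d, BatchNorm2d, ReLU} fusion rule."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Conv -> BN -> ReLU appears as one fusible pattern during static
        # quantization, so it can be rewritten into a single fused op.
        return self.relu(self.bn(self.conv(x)))
```

When a model built from such blocks is passed through `QATQuantizer` or `PostQuantizer`, the matched patterns are fused as part of the quantization graph rewrite, so manual fusion calls should not be needed.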