Linear Quantization
Visualization
Then, for each layer in your network (linear, conv, etc.), you represent the matrices involved using the previous formulation, work through the arithmetic to see which terms can be precomputed and which ones zero out, and voilà.
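To make that arithmetic concrete, here is a minimal sketch for a single linear layer. It assumes the "previous formulation" is the usual affine mapping r ≈ S(q − Z), with asymmetric quantization for the activations and symmetric quantization (zero point 0) for the weights. Under those assumptions, Y = XW + b expands to S_x S_w (q_x q_w − Z_x Σ_k q_w[k, j]) + b: the terms involving the weight zero point vanish, and the column sums of q_w can be computed once offline. All function and variable names are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def quantize(values, scale, zero_point, qmin, qmax):
    """Map a real-valued matrix to integers so that r ~= scale * (q - zero_point)."""
    return [
        [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in row]
        for row in values
    ]


def matmul(a, b):
    """Plain matrix product on nested lists (stands in for an integer GEMM)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def quantized_linear(q_x, q_w, bias, s_x, z_x, s_w):
    """Compute Y = XW + b from quantized inputs.

    Assumes the weight zero point is 0 (symmetric), so the q_x * z_w and
    z_x * z_w terms of the expansion drop out entirely.
    """
    # Depends only on the weights, so it can be precomputed before inference.
    col_sums = [sum(col) for col in zip(*q_w)]
    offset = [z_x * c for c in col_sums]

    acc = matmul(q_x, q_w)  # integer-only accumulation
    return [
        [s_x * s_w * (a - o) + b for a, o, b in zip(row, offset, bias)]
        for row in acc
    ]


# Toy values chosen to be exactly representable, so no rounding error appears.
x = [[1.0, -0.5], [0.0, 2.5]]
w = [[0.5, -0.25], [1.0, 0.75]]
bias = [0.1, -0.2]
s_x, z_x = 0.5, 2   # asymmetric activations (unsigned 8-bit range)
s_w = 0.25          # symmetric weights (signed 8-bit range, zero point 0)

q_x = quantize(x, s_x, z_x, 0, 255)
q_w = quantize(w, s_w, 0, -128, 127)

y_quant = quantized_linear(q_x, q_w, bias, s_x, z_x, s_w)
y_float = [[v + b for v, b in zip(row, bias)] for row in matmul(x, w)]
```

With these toy inputs `y_quant` agrees with the floating-point reference `y_float`; with real data the two differ by the quantization error. A convolution can be treated the same way once it is written as a matrix product.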