Second Brain

Neural Network Quantization

Related:
- HuggingFace Docs
- A survey of quantization methods for efficient neural network inference
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration, a recent (2024) work by Lin et al. from Song Han's group