# Quantization

Quantization techniques reduce memory and computational costs by representing weights and activations with lower-precision data types like 8-bit integers (int8). This makes it possible to load larger models that otherwise wouldn't fit into memory and to speed up inference.

> [!TIP]
> Learn how to quantize models in the [Quantization](../quantization/overview) guide.

## PipelineQuantizationConfig

[[autodoc]] quantizers.PipelineQuantizationConfig

## BitsAndBytesConfig

[[autodoc]] quantizers.quantization_config.BitsAndBytesConfig

## GGUFQuantizationConfig

[[autodoc]] quantizers.quantization_config.GGUFQuantizationConfig

## QuantoConfig

[[autodoc]] quantizers.quantization_config.QuantoConfig

## TorchAoConfig

[[autodoc]] quantizers.quantization_config.TorchAoConfig

## DiffusersQuantizer

[[autodoc]] quantizers.base.DiffusersQuantizer
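As a quick orientation before diving into the class references above, the configuration classes on this page are passed to `from_pretrained` via the `quantization_config` argument. The snippet below is a minimal sketch of quantizing pipeline components to 8-bit with the bitsandbytes backend; it assumes a recent diffusers release with bitsandbytes and accelerate installed, and the model id and component names are purely illustrative.

```py
import torch
from diffusers import DiffusionPipeline
from diffusers.quantizers import PipelineQuantizationConfig

# Quantize selected pipeline components to int8 with the bitsandbytes backend.
# The model id and component names are examples, not requirements.
pipeline_quant_config = PipelineQuantizationConfig(
    quant_backend="bitsandbytes_8bit",
    quant_kwargs={"load_in_8bit": True},
    components_to_quantize=["transformer", "text_encoder_2"],
)

pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    quantization_config=pipeline_quant_config,
    torch_dtype=torch.bfloat16,
)
```

Per-model quantization works the same way: pass one of the backend-specific configs (for example `BitsAndBytesConfig`) as `quantization_config` when loading an individual model class. See the [Quantization](../quantization/overview) guide for the full workflow.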