🚀 ZeroGPU now supports PyTorch native quantization via torchao.

While it hasn’t been battle-tested yet, Int8WeightOnlyConfig is already working flawlessly in our tests. Let us know if you run into any issues — and we’re excited to see what the community will build!
import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_
pipeline = FluxPipeline.from_pretrained(...).to('cuda')
quantize_(pipeline.transformer, Int8WeightOnlyConfig()) # Or any other component(s)
@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]
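For intuition about what Int8WeightOnlyConfig does: weight-only int8 quantization stores each weight row as int8 values plus a float scale, and dequantizes back to float at compute time. Here is a rough, dependency-free sketch of that idea (illustrative only — torchao's actual implementation is per-channel, tensor-based, and far more optimized):

```python
# Conceptual sketch of symmetric int8 weight-only quantization.
# Not torchao's real code -- just the underlying idea.

def quantize_int8(weights):
    """Map float weight rows to int8 values plus one scale per row."""
    quantized, scales = [], []
    for row in weights:
        # Symmetric scale: largest magnitude maps to +/-127.
        scale = max(abs(w) for w in row) / 127 or 1.0
        quantized.append([round(w / scale) for w in row])
        scales.append(scale)
    return quantized, scales

def dequantize(quantized, scales):
    """Recover approximate float weights at compute time."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]

w = [[0.5, -1.27, 0.02], [2.54, 0.0, -2.54]]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

The weights are stored at a quarter of their float32 footprint, at the cost of a small rounding error recovered on dequantization.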