cbensimon 
posted an update 4 days ago
🚀 ZeroGPU now supports PyTorch native quantization via torchao

While it hasn’t been battle-tested yet, Int8WeightOnlyConfig is already working flawlessly in our tests.

Let us know if you run into any issues — and we’re excited to see what the community will build!

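import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_

# Load the pipeline and move it to the GPU at module level, outside the @spaces.GPU function
pipeline = FluxPipeline.from_pretrained(...).to('cuda')
# quantize_ converts the module's weights to int8 in place
quantize_(pipeline.transformer, Int8WeightOnlyConfig()) # Or any other component(s)

# The GPU is attached only while this decorated function runs
@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]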
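
For anyone wondering what int8 weight-only quantization does conceptually, here is a minimal pure-Python sketch. It assumes symmetric quantization with one scale per output row. It only illustrates the general idea and is not torchao's actual implementation or kernels. All function names below are hypothetical.

```python
# Illustrative sketch only: symmetric per-row int8 weight quantization.
# Function names are hypothetical and do not come from torchao.

def quantize_row_int8(row):
    """Map one row of float weights to integers in [-127, 127] plus a float scale."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale

def linear_int8_weight_only(x, q_weight, scales):
    """Compute y = x @ W^T where W is stored as int8 rows with one scale per row.

    Activations stay in floating point; only the weights are quantized.
    """
    return [
        scale * sum(xi * qi for xi, qi in zip(x, q_row))
        for q_row, scale in zip(q_weight, scales)
    ]

# Toy weight matrix with 2 output rows and 3 input features.
weight = [[0.50, -1.27, 0.03], [2.00, 0.10, -0.75]]
quantized = [quantize_row_int8(row) for row in weight]
q_weight = [q for q, _ in quantized]
scales = [s for _, s in quantized]

x = [1.0, 2.0, 3.0]
y_ref = [sum(xi * wi for xi, wi in zip(x, row)) for row in weight]
y_q = linear_int8_weight_only(x, q_weight, scales)
print("reference:", y_ref)
print("int8 weight-only:", y_q)
```

In this sketch the weights take roughly a quarter of the float32 storage, at the cost of a rounding error of at most half a scale step per weight.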

cc @sayakpaul for visibility

Check this one out for merging images: https://mergejpg.org/

Great!

Edit:
One request.
I know this is difficult (or rather, troublesome) given how ZeroGPU works, but it would help if the CUDA Toolkit, cuBLAS, and related tools were available during the build process only, so that we could use the latest libraries.
Currently, for tools that need CUDA at build time and have no pre-built wheels, I have to fork the GitHub repository and build the binaries myself...


It makes sense and that's noted @John6666