@cbensimon on Hugging Face: "🚀 ZeroGPU now supports PyTorch native quantization via `torchao` While it…"

cbensimon

posted an update 4 days ago

🚀 ZeroGPU now supports PyTorch native quantization via torchao

While it hasn’t been battle-tested yet, Int8WeightOnlyConfig is already working flawlessly in our tests.

Let us know if you run into any issues — and we’re excited to see what the community will build!

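import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_

# Load the pipeline once at module level, then quantize it before any request is served
pipeline = FluxPipeline.from_pretrained(...).to('cuda')
# quantize_ modifies the module in place
quantize_(pipeline.transformer, Int8WeightOnlyConfig())  # Or any other component(s)

# ZeroGPU attaches a GPU for the duration of each decorated call
@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]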
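
For readers new to the idea, here is a rough sketch of what int8 weight-only quantization means conceptually: each weight row is stored as int8 values plus one float scale, and the weights are dequantized on the fly while activations stay in floating point. This is a plain-Python illustration under simplifying assumptions (symmetric per-row scaling), not torchao's actual implementation; all function names below are hypothetical.

```python
# Illustrative only: symmetric per-row int8 weight-only quantization.
# NOT torchao's implementation; every name here is hypothetical.

def quantize_row(row):
    """Map one row of float weights to int8 values plus a single float scale."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale

def quantize_weights(weight):
    """Quantize a 2-D weight matrix row by row."""
    return [quantize_row(row) for row in weight]

def linear_fp(x, weight):
    """Reference float matrix-vector product y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

def linear_int8(x, qweight):
    """Same product using int8 weights, rescaled per row; x stays float."""
    return [scale * sum(qi * xi for qi, xi in zip(q, x)) for q, scale in qweight]

weight = [[0.50, -1.27, 0.03], [0.002, 0.004, -0.001]]
x = [1.0, 2.0, -1.0]

qweight = quantize_weights(weight)
y_ref = linear_fp(x, weight)
y_q = linear_int8(x, qweight)
max_err = max(abs(a - b) for a, b in zip(y_ref, y_q))
print("reference:", y_ref)
print("int8 weight-only:", y_q)
print("max abs error:", max_err)
```

Only the weights shrink (roughly one byte per value instead of two or four), which is why this style of quantization mainly helps with memory footprint; the rounding step is the source of the small output error printed at the end.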

cbensimon

4 days ago

cc @sayakpaul for visibility

asgharali7072

4 days ago

Check this one out for merging images: https://mergejpg.org/

John6666

3 days ago • edited 3 days ago

Great!

Edit:
One request.
I know this is difficult (or rather, troublesome) given how ZeroGPU works, but it would help if the CUDA Toolkit, cuBLAS, and similar tools were available during the build step, even if only there, so that we could use the latest libraries.
Currently, for tools that need CUDA at build time and do not ship pre-built wheels, I have to fork the GitHub repository and build the binaries myself...

cbensimon

3 days ago

It makes sense and that's noted @John6666
