Outlook

We have quantised the model to 2-bit (GGUF TQ2_0) so it can be served at scale on low-end GPU cards. The quantisation was performed with the llama.cpp library.

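As a rough sketch of how the quantised file can be loaded for inference, the snippet below uses the llama-cpp-python bindings for llama.cpp. The GGUF filename pattern, context size, and generation parameters are assumptions (not taken from this repo) and should be adjusted to the file actually published here.

```python
# Minimal sketch: load the 2-bit GGUF with llama-cpp-python (bindings for llama.cpp).
# Filename pattern and parameters below are assumptions; check the repo's file list.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="sleeping-ai/Gemma3-27B-IT-TQ2-0",
    filename="*TQ2_0.gguf",   # assumed glob for the quantised file in this repo
    n_ctx=4096,               # context window; lower it to save memory
    n_gpu_layers=-1,          # offload all layers to the GPU if VRAM allows
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What does 2-bit quantisation trade off?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```
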
GGUF
Model size: 27B params
Architecture: gemma3
Quantisation: 2-bit (TQ2_0)

Model tree for sleeping-ai/Gemma3-27B-IT-TQ2-0
This model is a quantised variant of Gemma3-27B-IT.