Collection of quantized Gemma 3 models created by Google.

Red Hat AI
company
Verified
AI & ML interests
Open source and AI
Organization Card
Red Hat AI
Build AI for your world
The Red Hat AI repository on Hugging Face is an open-source initiative backed by deep collaboration between IBM and Red Hat’s research, engineering, and business units. We’re committed to making AI more accessible, efficient, and community-driven from research to production.
We believe the future of AI is open. That’s why we share our latest models and research on Hugging Face, freely available to help researchers, developers, and organizations deploy high-performance AI at scale.
🔧 With Red Hat AI, you can:
- Use or build optimized foundation models, including Llama, Mistral, Qwen, Gemma, DeepSeek, and others, tailored for performance and accuracy in real-world deployments.
- Customize and fine-tune models for your workflows, from experimentation to production, with tools and frameworks built to support reproducible research and enterprise AI pipelines.
- Maximize inference efficiency across hardware using production-grade compression and optimization techniques like quantization (FP8, INT8, INT4), structured/unstructured sparsity, distillation, and more, ready for cost-efficient deployments with vLLM.
- Deploy validated models: models validated by Red Hat AI offer confidence, predictability, and flexibility when deploying third-party generative AI models across the Red Hat AI platform. Red Hat AI validates models by running a series of capacity planning scenarios with GuideLLM for benchmarking, Language Model Evaluation Harness for accuracy evaluations, and vLLM for inference serving across a wide variety of AI accelerators.
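To make the quantization point above concrete, here is a rough back-of-the-envelope sketch of how weight storage shrinks with bit width. It assumes a nominal 27 billion parameters (taken from the "27b" in the model name, not an exact count) and a 16-bit baseline, and it ignores quantization scales, zero points, and any layers left unquantized, so real checkpoints will differ somewhat.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width and
    there is no overhead for scales, zero points, or metadata.
    """
    return num_params * bits_per_weight / 8 / 1e9


NOMINAL_PARAMS = 27e9  # assumed from the "27b" model name, not an exact count

for label, bits in [("16-bit baseline", 16), ("FP8 / INT8", 8), ("INT4", 4)]:
    print(f"{label}: {weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

Under these assumptions, halving the bit width halves the weight footprint, which is the main reason lower-precision checkpoints fit on smaller or fewer accelerators.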
🔗 Explore relevant open-source tools:
- vLLM – Serve large language models efficiently across GPUs and environments.
- LLM Compressor – Compress and optimize your own models with state-of-the-art quantization and sparsity techniques.
- InstructLab – Fine-tune open models with your data using scalable, community-backed workflows.
- GuideLLM – Benchmark, evaluate, and guide your deployments with structured performance and latency insights.
Or learn more about our full product suite at https://www.redhat.com/en/products/ai
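As an illustration of how one of these tools might be used with a model from this organization, the following is a minimal sketch of offline generation with vLLM. It assumes vLLM is installed, a supported accelerator is available, and the chosen repository loads without extra configuration; none of that is confirmed by this page. The `generate` helper and its defaults are hypothetical names introduced for the example.

```python
# Model ID taken from the model listing on this page.
MODEL_ID = "RedHatAI/gemma-3-4b-it-FP8-dynamic"


def generate(prompt: str, model_id: str = MODEL_ID, max_tokens: int = 64) -> str:
    """Generate a completion for a single prompt (hypothetical helper).

    vLLM is imported lazily so that defining this function does not
    require vLLM or an accelerator to be present.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)  # downloads the weights on first use
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    # Note: this passes the raw prompt; no chat template is applied here.
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


# Example call (requires vLLM and suitable hardware):
# print(generate("Explain weight quantization in one sentence."))
```

For instruction-tuned models, applying the model's chat template before generation would likely give better results; the sketch skips that step to stay short.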
Collections
10
Collection of quantized Whisper models created by OpenAI
- RedHatAI/whisper-large-v3-turbo-quantized.w4a16
  Automatic Speech Recognition • Updated • 1.7k • 1
- RedHatAI/whisper-large-v3-turbo-quantized.w8a8
  Automatic Speech Recognition • Updated • 169 • 2
- RedHatAI/whisper-large-v3-turbo-FP8-Dynamic
  Automatic Speech Recognition • Updated • 38 • 1
- RedHatAI/whisper-tiny-FP8-Dynamic
  Automatic Speech Recognition • Updated • 24
Models
482

RedHatAI/Llama-4-Maverick-17B-128E-Instruct-quantized.w4a16
Image-Text-to-Text • Updated • 54

RedHatAI/Magistral-Small-2506-FP8
Updated • 6.67k • 6

RedHatAI/gemma-3-4b-it-quantized.w8a8
Image-Text-to-Text • Updated • 129

RedHatAI/gemma-3-12b-it-quantized.w8a8
Image-Text-to-Text • Updated • 179

RedHatAI/gemma-3-27b-it-quantized.w8a8
Image-Text-to-Text • Updated • 424 • 4

RedHatAI/gemma-3-27b-it-quantized.w4a16
Image-Text-to-Text • Updated • 199

RedHatAI/gemma-3-12b-it-quantized.w4a16
Image-Text-to-Text • Updated • 239

RedHatAI/gemma-3-4b-it-quantized.w4a16
Image-Text-to-Text • Updated • 584 • 2

RedHatAI/gemma-3-4b-it-FP8-dynamic
Image-Text-to-Text • Updated • 203

RedHatAI/gemma-3-12b-it-FP8-dynamic
Image-Text-to-Text • Updated • 203
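
The repository names above encode the quantization scheme in their suffix. The sketch below reads that suffix, assuming the common convention that `wXaY` means X-bit weights with Y-bit activations and that an `FP8` suffix means 8-bit floating-point weights and activations, with `dynamic` indicating activation scales computed at run time. The `parse_scheme` helper is hypothetical, and the individual model cards remain the authoritative source for each scheme.

```python
import re
from typing import Optional, Tuple


def parse_scheme(repo_id: str) -> Optional[Tuple[str, str]]:
    """Return (weights, activations) descriptions inferred from a repo name.

    Assumption: names follow the suffix conventions seen in this listing.
    Returns None when no known suffix is found.
    """
    name = repo_id.split("/")[-1]

    # e.g. "...quantized.w4a16" -> 4-bit weights, 16-bit activations
    match = re.search(r"\.w(\d+)a(\d+)$", name, flags=re.IGNORECASE)
    if match:
        return (f"{match.group(1)}-bit weights",
                f"{match.group(2)}-bit activations")

    # e.g. "...-FP8" or "...-FP8-dynamic" (case-insensitive)
    if re.search(r"-FP8(-dynamic)?$", name, flags=re.IGNORECASE):
        dynamic = name.lower().endswith("-dynamic")
        activations = "FP8 activations"
        if dynamic:
            activations += " (dynamic scales)"
        return ("FP8 weights", activations)

    return None


for repo in [
    "RedHatAI/gemma-3-27b-it-quantized.w4a16",
    "RedHatAI/gemma-3-4b-it-quantized.w8a8",
    "RedHatAI/whisper-tiny-FP8-Dynamic",
]:
    print(repo, "->", parse_scheme(repo))
```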
Datasets
0
None public yet