lex-au/shuttle-3.5-Q8_0-GGUF


lex-au committed on May 1

Commit ae7e231 · verified · 1 Parent(s): f1bf6a5

Create README.md

Files changed (1)

README.md +57 -0

README.md ADDED

	@@ -0,0 +1,57 @@

+---
+license: apache-2.0
+language:
+- en
+- zh
+- es
+base_model:
+- shuttleai/shuttle-3.5
+tags:
+- Qwen
+- Shuttle
+- GGUF
+- 32b
+- quantized
+- Q8_0
+---
+# Shuttle 3.5 — Q8_0 GGUF Quant
+This repo contains a Q8_0 GGUF quantization of [ShuttleAI's Shuttle 3.5 model](https://huggingface.co/shuttleai/shuttle-3.5), a high-performance instruction-tuned variant of Qwen 3 32B. The quant is intended for efficient local inference with minimal quality loss.
+## 🔗 Base Model
+- **Original**: [shuttleai/shuttle-3.5](https://huggingface.co/shuttleai/shuttle-3.5)
+- **Parent architecture**: Qwen 3 32B
+- **Quantized by**: Lex-au
+- **Quantization format**: GGUF Q8_0
+## 📦 Model Size
+| Format   | Size     |
+|----------|----------|
+| Original (safetensors, F16) | 65.52 GB |
+| Q8_0 (GGUF)                 | 34.8 GB  |
+**Compression Ratio**: ~1.88:1 (the Q8_0 file is ~53% of the F16 size)
+**Size Reduction**: ~47% (30.72 GB saved)
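The size figures can be checked with a few lines of arithmetic. The sketch below uses only the two sizes from the table. The 8.5 bits-per-weight figure is an assumption based on the usual Q8_0 block layout in llama.cpp (32 int8 weights plus one 16-bit scale per block); it is not stated in this README.

```python
# Sizes taken from the table above, in GB.
F16_GB = 65.52
Q8_0_GB = 34.8


def size_stats(original_gb, quantized_gb):
    """Return (fraction of original size kept, fraction saved, GB saved)."""
    kept = quantized_gb / original_gb
    saved_gb = original_gb - quantized_gb
    return kept, 1.0 - kept, saved_gb


kept, saved, saved_gb = size_stats(F16_GB, Q8_0_GB)
print(f"Q8_0 is {kept:.1%} of the F16 size")
print(f"Reduction: {saved:.1%} ({saved_gb:.2f} GB saved)")

# Assumption: Q8_0 stores 32 int8 weights plus one 16-bit scale per block,
# i.e. (32 * 8 + 16) / 32 = 8.5 bits per weight, versus 16 bits for F16.
expected_kept = ((32 * 8 + 16) / 32) / 16
print(f"Expected from block layout: {expected_kept:.1%}")
```

If the assumed block layout holds, the measured ratio and the expected ratio should agree closely, which suggests the file size is dominated by the quantized weights rather than metadata.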
+## 🧪 Quality
+- Q8_0 is **near-lossless**, preserving almost all performance of the full-precision model.
+- Ideal for high-quality inference on capable consumer hardware.
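To give a rough sense of why 8-bit block quantization loses so little, here is a simplified round-trip sketch in plain Python. It is not llama.cpp's implementation: the block size of 32 and the absmax scaling are assumptions about the Q8_0 scheme, the scale is kept as a Python float rather than a 16-bit value, and the weights are synthetic random numbers, not values from this model.

```python
import random

BLOCK_SIZE = 32  # assumption: number of weights sharing one scale in Q8_0


def quantize_block(values):
    """Quantize one block to (scale, int8 codes) using absmax scaling."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0.0, [0] * len(values)
    scale = amax / 127.0
    # Round to the nearest step and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate weights from a scale and its int8 codes."""
    return [scale * c for c in codes]


random.seed(0)
# Synthetic weights for illustration only.
weights = [random.gauss(0.0, 0.02) for _ in range(BLOCK_SIZE)]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.6f}  max abs error={max_err:.6f}")
```

In this sketch the rounding error per weight is at most half a quantization step, which is about 0.4% of the largest magnitude in the block. Storing the scale in 16 bits, as real files do, would add a small extra error that is not modeled here.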
+## 🚀 Usage
+Compatible with GGUF-supporting runtimes whose builds are recent enough to include Qwen 3 support, including:
+- `llama.cpp`
+- `KoboldCPP`
+- `text-generation-webui`
+- `llamafile`
+- `LM Studio`
+Example with `llama.cpp`:
+```bash
+# Recent llama.cpp builds name the CLI binary llama-cli (formerly main).
+./llama-cli -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 --prompt "Describe the effects of quantum decoherence in plain English."