lex-au committed ae7e231 (verified · parent f1bf6a5)

Create README.md

Files changed: README.md (+57 lines)
---
license: apache-2.0
language:
- en
- zh
- es
base_model:
- shuttleai/shuttle-3.5
tags:
- Qwen
- Shuttle
- GGUF
- 32b
- quantized
- Q8_0
---

# Shuttle 3.5 – Q8_0 GGUF Quant

This repo contains a GGUF quantized version of [ShuttleAI's Shuttle 3.5 model](https://huggingface.co/shuttleai/shuttle-3.5), a high-performance instruction-tuned variant of Qwen 3 32B. This quant was built for efficient local inference without sacrificing quality.

## 🔗 Base Model

- **Original**: [shuttleai/shuttle-3.5](https://huggingface.co/shuttleai/shuttle-3.5)
- **Parent architecture**: Qwen 3 32B
- **Quantized by**: Lex-au
- **Quantization format**: GGUF Q8_0

## 📦 Model Size

| Format                      | Size     |
|-----------------------------|----------|
| Original (safetensors, F16) | 65.52 GB |
| Q8_0 (GGUF)                 | 34.8 GB  |

**Compression ratio**: Q8_0 is ~53% of the F16 size
**Size reduction**: ~47% (30.72 GB saved)

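The figures above can be sanity-checked with a couple of lines of Python (the sizes are taken straight from the table; this is arithmetic, not a new measurement):

```python
original_gb = 65.52  # F16 safetensors size from the table
quant_gb = 34.8      # Q8_0 GGUF size from the table

saved_gb = original_gb - quant_gb
reduction_pct = saved_gb / original_gb * 100

print(f"Saved: {saved_gb:.2f} GB ({reduction_pct:.0f}% smaller)")
# Prints: Saved: 30.72 GB (47% smaller)
```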
## 🧪 Quality

- Q8_0 is **near-lossless**, preserving almost all performance of the full-precision model.
- Ideal for high-quality inference on capable consumer hardware.

## 🚀 Usage

Compatible with all major GGUF-supporting runtimes, including:

- `llama.cpp`
- `KoboldCPP`
- `text-generation-webui`
- `llamafile`
- `LM Studio`

Example with `llama.cpp` (the `main` binary is named `llama-cli` in recent builds):

```bash
./main -m shuttle-3.5.Q8_0.gguf --ctx-size 4096 --threads 16 --prompt "Describe the effects of quantum decoherence in plain English."
```
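The same quant can also be driven from Python via the `llama-cpp-python` bindings (an assumption on my part: install with `pip install llama-cpp-python`; the model path mirrors the `llama.cpp` example). A hedged sketch that skips itself when the bindings or the model file are absent:

```python
from importlib.util import find_spec
from pathlib import Path

MODEL_PATH = "shuttle-3.5.Q8_0.gguf"  # same file as in the llama.cpp example

if find_spec("llama_cpp") is not None and Path(MODEL_PATH).exists():
    from llama_cpp import Llama

    # Mirrors --ctx-size 4096 --threads 16 from the CLI invocation above.
    llm = Llama(model_path=MODEL_PATH, n_ctx=4096, n_threads=16)
    out = llm(
        "Describe the effects of quantum decoherence in plain English.",
        max_tokens=256,
    )
    print(out["choices"][0]["text"])
else:
    print("llama-cpp-python or the model file is missing; skipping.")
```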