Spiral-Qwen3-4B-F32-GGUF
SPIRAL employs an actor-learner architecture for scalable self-play training. Parallel actors sample trajectories from a diverse set of games using vectorized environments. A single policy $\pi_t$ plays both roles, generating trajectories from zero-sum, sparse-reward games. The centralized learner processes these trajectories with Role-conditioned Advantage Estimation (RAE), which computes a separate advantage for each role, $A_0(s,a)$ and $A_1(s,a)$. These advantages then drive on-policy reinforcement learning updates.
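To make the idea of separate per-role advantages concrete, here is a minimal sketch. It is not taken from this card: it assumes each role keeps its own running baseline (an exponential moving average of that role's returns, per game) and that the advantage is the return minus that baseline. The names `RoleBaselines`, `decay`, and `role_advantages` are hypothetical, and the sketch works at trajectory level rather than per $(s,a)$ pair.

```python
from collections import defaultdict


class RoleBaselines:
    """Hypothetical per-(game, role) running baselines for two-player zero-sum games.

    Assumption: the baseline is an exponential moving average of returns.
    The card itself does not specify how the baselines are computed.
    """

    def __init__(self, decay=0.95):
        self.decay = decay
        self.baseline = defaultdict(float)  # every baseline starts at 0.0

    def advantage(self, game, role, ret):
        key = (game, role)
        # Update this role's baseline, then measure the return against it.
        self.baseline[key] = self.decay * self.baseline[key] + (1.0 - self.decay) * ret
        return ret - self.baseline[key]


def role_advantages(baselines, game, return_role0):
    """Return (A_0, A_1) for one finished game.

    Zero-sum assumption: role 1's return is the negative of role 0's.
    """
    a0 = baselines.advantage(game, 0, return_role0)
    a1 = baselines.advantage(game, 1, -return_role0)
    return a0, a1


if __name__ == "__main__":
    rae = RoleBaselines(decay=0.5)
    # Role 0 wins two games in a row; its advantage shrinks as its baseline rises.
    print(role_advantages(rae, "tic-tac-toe", 1.0))
    print(role_advantages(rae, "tic-tac-toe", 1.0))
```

Keeping one baseline per role means that a game which systematically favours one side (for example, the first mover) does not bias the advantage of the other side, which is one plausible motivation for conditioning on role.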
Model Files
File | Size | Format |
---|---|---|
Spiral-Qwen3-4B.F32.gguf | 16.1 GB | 32-bit float |
Spiral-Qwen3-4B.BF16.gguf | 8.05 GB | BFloat16 |
Spiral-Qwen3-4B.F16.gguf | 8.05 GB | 16-bit float |
Spiral-Qwen3-4B.Q8_0.gguf | 4.28 GB | 8-bit quantized |
Spiral-Qwen3-4B.Q6_K.gguf | 3.31 GB | 6-bit quantized |
Spiral-Qwen3-4B.Q5_K_M.gguf | 2.89 GB | 5-bit quantized (medium) |
Spiral-Qwen3-4B.Q5_K_S.gguf | 2.82 GB | 5-bit quantized (small) |
Spiral-Qwen3-4B.Q4_K_M.gguf | 2.5 GB | 4-bit quantized (medium) |
Spiral-Qwen3-4B.Q4_K_S.gguf | 2.38 GB | 4-bit quantized (small) |
Spiral-Qwen3-4B.Q3_K_L.gguf | 2.24 GB | 3-bit quantized (large) |
Spiral-Qwen3-4B.Q3_K_M.gguf | 2.08 GB | 3-bit quantized (medium) |
Spiral-Qwen3-4B.Q3_K_S.gguf | 1.89 GB | 3-bit quantized (small) |
Spiral-Qwen3-4B.Q2_K.gguf | 1.67 GB | 2-bit quantized |
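As a rough aid for choosing among the files above, the sketch below picks the largest quantization that fits a memory budget and estimates the effective bits per weight. The sizes are copied from the table; the helper names and the 1 GB `headroom_gb` default (reserved for context and runtime overhead) are assumptions for illustration, as is the simplification that the F32 file consists almost entirely of 32-bit weights.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "F32": 16.1,
    "BF16": 8.05,
    "F16": 8.05,
    "Q8_0": 4.28,
    "Q6_K": 3.31,
    "Q5_K_M": 2.89,
    "Q5_K_S": 2.82,
    "Q4_K_M": 2.5,
    "Q4_K_S": 2.38,
    "Q3_K_L": 2.24,
    "Q3_K_M": 2.08,
    "Q3_K_S": 1.89,
    "Q2_K": 1.67,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest quant whose file fits in budget_gb minus headroom_gb.

    headroom_gb is a hypothetical allowance for KV cache and runtime overhead;
    real requirements depend on context length and the inference backend.
    Returns None when nothing fits.
    """
    usable = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def approx_bits_per_weight(name):
    """Estimate effective bits per weight by scaling against the F32 file.

    Assumes the F32 file is essentially all weights at 32 bits each.
    """
    return QUANT_SIZES_GB[name] / QUANT_SIZES_GB["F32"] * 32.0


if __name__ == "__main__":
    for budget in (4.0, 6.0, 12.0):
        choice = pick_quant(budget)
        print(budget, choice, round(approx_bits_per_weight(choice), 2))
```

File size is only a proxy for memory use and says nothing about output quality, so treat the result as a starting point rather than a recommendation.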
Quants Usage
(The files above are sorted by size, not necessarily by quality. IQ-quants are often preferable to similarly sized non-IQ quants.)
[Figure: graph by ikawrakow comparing some lower-quality quant types (lower is better)]