Spiral-Qwen3-4B-F32-GGUF

SPIRAL employs an actor-learner architecture for scalable self-play training. Parallel actors sample trajectories from a diverse set of games using vectorized environments. A single policy $\pi_t$ plays both roles, generating trajectories from zero-sum, sparse-reward games. The centralized learner processes these trajectories with Role-conditioned Advantage Estimation (RAE), computing a separate advantage for each role, $A_0(s,a)$ and $A_1(s,a)$, which then drive the on-policy reinforcement learning updates.
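The per-role advantage idea can be illustrated with a small sketch. This is not the SPIRAL implementation: it assumes a running-mean baseline kept separately for each (game, role) pair, with the advantage taken as the episode return minus that baseline. The class `RoleBaselines`, the function `episode_advantages`, the `decay` value, and the game label `"game_a"` are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class RoleBaselines:
    """Hypothetical helper: one running-mean baseline per (game, role)."""

    def __init__(self, decay: float = 0.95):
        self.decay = decay                 # assumed smoothing factor
        self._b = defaultdict(float)       # baselines start at 0.0

    def advantage(self, game: str, role: int, episode_return: float) -> float:
        key = (game, role)
        # Advantage relative to the role's own baseline (assumed ordering:
        # compute the advantage first, then update the baseline).
        adv = episode_return - self._b[key]
        self._b[key] = self.decay * self._b[key] + (1.0 - self.decay) * episode_return
        return adv


def episode_advantages(baselines: RoleBaselines, game: str, role0_return: float) -> dict:
    """Zero-sum game: role 1 receives the negated return of role 0."""
    returns = {0: role0_return, 1: -role0_return}
    return {role: baselines.advantage(game, role, r) for role, r in returns.items()}


if __name__ == "__main__":
    b = RoleBaselines(decay=0.5)
    print(episode_advantages(b, "game_a", 1.0))  # first episode, baselines at 0
    print(episode_advantages(b, "game_a", 1.0))  # baselines have moved toward the returns
```

Keeping the baselines separate means that a role which tends to win (for example, the first mover) is not credited for that structural edge; each role's advantage is measured against its own typical outcome.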

Model Files

File	Size	Format
Spiral-Qwen3-4B.F32.gguf	16.1 GB	32-bit float
Spiral-Qwen3-4B.BF16.gguf	8.05 GB	BFloat16
Spiral-Qwen3-4B.F16.gguf	8.05 GB	16-bit float
Spiral-Qwen3-4B.Q8_0.gguf	4.28 GB	8-bit quantized
Spiral-Qwen3-4B.Q6_K.gguf	3.31 GB	6-bit quantized
Spiral-Qwen3-4B.Q5_K_M.gguf	2.89 GB	5-bit quantized (medium)
Spiral-Qwen3-4B.Q5_K_S.gguf	2.82 GB	5-bit quantized (small)
Spiral-Qwen3-4B.Q4_K_M.gguf	2.5 GB	4-bit quantized (medium)
Spiral-Qwen3-4B.Q4_K_S.gguf	2.38 GB	4-bit quantized (small)
Spiral-Qwen3-4B.Q3_K_L.gguf	2.24 GB	3-bit quantized (large)
Spiral-Qwen3-4B.Q3_K_M.gguf	2.08 GB	3-bit quantized (medium)
Spiral-Qwen3-4B.Q3_K_S.gguf	1.89 GB	3-bit quantized (small)
Spiral-Qwen3-4B.Q2_K.gguf	1.67 GB	2-bit quantized
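
The nominal bit widths in the table are labels, not exact storage costs. A rough effective bits-per-weight figure can be estimated from the file sizes alone, as in the sketch below. It assumes the F32 file stores every weight in 32 bits and that metadata overhead is negligible; the function name `approx_bits_per_weight` is hypothetical and the sizes are copied from the table.

```python
# File sizes in GB, copied from the table above.
F32_SIZE_GB = 16.1

SIZES_GB = {
    "BF16": 8.05,
    "Q8_0": 4.28,
    "Q6_K": 3.31,
    "Q5_K_M": 2.89,
    "Q4_K_M": 2.5,
    "Q3_K_M": 2.08,
    "Q2_K": 1.67,
}


def approx_bits_per_weight(size_gb: float, f32_size_gb: float = F32_SIZE_GB) -> float:
    """Scale 32 bits by the size ratio to the F32 file (rough estimate only)."""
    return 32.0 * size_gb / f32_size_gb


if __name__ == "__main__":
    for name, size in SIZES_GB.items():
        print(f"{name:8s} {approx_bits_per_weight(size):5.2f} bits/weight (approx.)")
```

Under these assumptions Q8_0 comes out at roughly 8.5 bits per weight and Q2_K at roughly 3.3, which is one reason the low-bit files are larger than their names suggest.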

Quants Usage

(The files above are sorted by size, which does not necessarily reflect quality. IQ-quants are often preferable to similarly sized non-IQ quants.)
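
As a starting point for choosing a file, the sketch below picks the largest listed file that fits a given memory budget. It treats size as a rough proxy for fidelity, which, as noted above, is not guaranteed. The function `largest_fitting` and the default `headroom_gb` reserve (for context cache and runtime overhead) are assumptions made for illustration, not recommendations from the model authors.

```python
# (filename, size in GB), copied from the Model Files table.
QUANT_FILES = [
    ("Spiral-Qwen3-4B.F32.gguf", 16.1),
    ("Spiral-Qwen3-4B.BF16.gguf", 8.05),
    ("Spiral-Qwen3-4B.F16.gguf", 8.05),
    ("Spiral-Qwen3-4B.Q8_0.gguf", 4.28),
    ("Spiral-Qwen3-4B.Q6_K.gguf", 3.31),
    ("Spiral-Qwen3-4B.Q5_K_M.gguf", 2.89),
    ("Spiral-Qwen3-4B.Q5_K_S.gguf", 2.82),
    ("Spiral-Qwen3-4B.Q4_K_M.gguf", 2.5),
    ("Spiral-Qwen3-4B.Q4_K_S.gguf", 2.38),
    ("Spiral-Qwen3-4B.Q3_K_L.gguf", 2.24),
    ("Spiral-Qwen3-4B.Q3_K_M.gguf", 2.08),
    ("Spiral-Qwen3-4B.Q3_K_S.gguf", 1.89),
    ("Spiral-Qwen3-4B.Q2_K.gguf", 1.67),
]


def largest_fitting(budget_gb: float, headroom_gb: float = 1.0):
    """Return the largest file whose size fits in budget_gb minus headroom_gb.

    headroom_gb is an assumed reserve for the context cache and runtime
    overhead; returns None when nothing fits.
    """
    usable = budget_gb - headroom_gb
    fitting = [(name, size) for name, size in QUANT_FILES if size <= usable]
    if not fitting:
        return None
    return max(fitting, key=lambda item: item[1])[0]


if __name__ == "__main__":
    for budget in (2.0, 4.0, 6.0, 24.0):
        print(budget, "GB ->", largest_fitting(budget))
```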

[Figure: graph by ikawrakow comparing some lower-quality quant types (lower is better).]
