Arch-Router-1.5B

Arch-Router-1.5B is a compact 1.5B model for preference-aligned routing. It guides model selection by matching each query to user-defined domains (e.g., travel) or action types (e.g., image editing), which offers a practical mechanism for encoding preferences in routing decisions. The model learns to map queries to domain-action preferences, and those preferences drive the choice of model. Experiments on conversational datasets demonstrate that this approach achieves state-of-the-art (SOTA) results in matching queries with human preferences, outperforming top proprietary models.
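
To make the routing idea concrete, here is a minimal Python sketch of how user-defined routes could be tied to target models. Everything in it is an assumption for illustration: the route names, the model names, the prompt layout, and the JSON answer format are hypothetical and are not taken from the Arch-Router documentation. The router call itself is left out; only the surrounding glue is shown.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str          # hypothetical route identifier
    description: str   # plain-language description of the domain or action
    target_model: str  # hypothetical name of the model serving this route


# Hypothetical preferences: one domain route and one action route.
ROUTES = [
    Route("travel_booking", "Questions about flights, hotels, and itineraries", "model-a"),
    Route("image_editing", "Requests to crop, retouch, or restyle an image", "model-b"),
]
DEFAULT_MODEL = "model-default"  # used when no route matches


def build_router_prompt(query, routes):
    """Serialize the routes and the query into one prompt (layout is an assumption)."""
    route_specs = [{"name": r.name, "description": r.description} for r in routes]
    return (
        "Routes:\n" + json.dumps(route_specs, indent=2)
        + "\n\nQuery:\n" + query
        + '\n\nAnswer with JSON like {"route": "<name>"}.'
    )


def select_model(router_output, routes):
    """Map the router's JSON answer to a target model, falling back to the default."""
    try:
        chosen = json.loads(router_output).get("route")
    except (json.JSONDecodeError, AttributeError):
        return DEFAULT_MODEL  # malformed or non-object output
    by_name = {r.name: r.target_model for r in routes}
    return by_name.get(chosen, DEFAULT_MODEL)
```

In use, the text returned by the router for `build_router_prompt(query, ROUTES)` would be passed to `select_model`. Keeping the fallback explicit means an unparseable or unknown answer still yields a usable model choice.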

Model Files

File Name	Size	Type	Description
Arch-Router-1.5B.Q2_K.gguf	676 MB	Model	Q2_K quantized model (smallest)
Arch-Router-1.5B.Q3_K_S.gguf	761 MB	Model	Q3_K_S quantized model
Arch-Router-1.5B.Q3_K_M.gguf	824 MB	Model	Q3_K_M quantized model
Arch-Router-1.5B.Q3_K_L.gguf	880 MB	Model	Q3_K_L quantized model
Arch-Router-1.5B.Q4_K_S.gguf	940 MB	Model	Q4_K_S quantized model
Arch-Router-1.5B.Q4_K_M.gguf	986 MB	Model	Q4_K_M quantized model
Arch-Router-1.5B.Q5_K_S.gguf	1.1 GB	Model	Q5_K_S quantized model
Arch-Router-1.5B.Q5_K_M.gguf	1.13 GB	Model	Q5_K_M quantized model
Arch-Router-1.5B.Q6_K.gguf	1.27 GB	Model	Q6_K quantized model
Arch-Router-1.5B.Q8_0.gguf	1.65 GB	Model	Q8_0 quantized model
Arch-Router-1.5B.BF16.gguf	3.09 GB	Model	BF16 precision model
Arch-Router-1.5B.F16.gguf	3.09 GB	Model	F16 precision model
Arch-Router-1.5B.F32.gguf	6.18 GB	Model	F32 full precision model (largest)
.gitattributes	2.49 kB	Config	Git LFS configuration
config.json	31 Bytes	Config	Model configuration
README.md	173 Bytes	Documentation	Repository documentation
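
As a rough aid for choosing among the files above, the following Python sketch picks the largest quantized file that fits a given size budget. The sizes are copied from the table, with GB values converted on the assumption that 1 GB = 1000 MB. The function name `pick_quant` and the `headroom_mb` parameter are hypothetical. File size is only a proxy: actual memory use at runtime is higher because of context and runtime buffers.

```python
# File sizes in MB, taken from the table above (1 GB treated as 1000 MB).
QUANT_SIZES_MB = {
    "Q2_K": 676, "Q3_K_S": 761, "Q3_K_M": 824, "Q3_K_L": 880,
    "Q4_K_S": 940, "Q4_K_M": 986, "Q5_K_S": 1100, "Q5_K_M": 1130,
    "Q6_K": 1270, "Q8_0": 1650, "BF16": 3090, "F16": 3090, "F32": 6180,
}


def pick_quant(budget_mb, headroom_mb=0):
    """Return the filename of the largest file that fits, or None if none fits.

    headroom_mb reserves space for anything beyond the weights themselves.
    """
    usable = budget_mb - headroom_mb
    fitting = {name: size for name, size in QUANT_SIZES_MB.items() if size <= usable}
    if not fitting:
        return None
    best = max(fitting, key=fitting.get)  # largest file within the budget
    return f"Arch-Router-1.5B.{best}.gguf"


if __name__ == "__main__":
    print(pick_quant(1000))
    print(pick_quant(2000, headroom_mb=500))
```

Picking the largest file that fits is a simple heuristic, since larger quants generally keep more precision; it does not account for speed or for quality differences between quant types of similar size.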

Quants Usage

The files are sorted by size, not necessarily by quality. IQ-quants are often preferable to similarly sized non-IQ quants.
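
Because size alone says little about precision, it can help to convert file sizes into an approximate number of bits per weight. The sketch below is a back-of-the-envelope estimate under stated assumptions: the F32 file (6.18 GB) is taken to store 4 bytes per weight, 1 GB is taken as 10^9 bytes, and file metadata is ignored. The resulting figures are file-level averages, so they need not match the nominal bit width of a quant type.

```python
# Selected file sizes in GB, taken from the model files table.
SIZES_GB = {
    "Q2_K": 0.676,
    "Q4_K_M": 0.986,
    "Q8_0": 1.65,
    "F16": 3.09,
    "F32": 6.18,
}

BYTES_PER_GB = 1e9  # assumption: decimal gigabytes

# Assumption: the F32 file holds 4 bytes per weight and little else.
ESTIMATED_WEIGHTS = SIZES_GB["F32"] * BYTES_PER_GB / 4


def bits_per_weight(size_gb):
    """Average bits stored per weight for a file of the given size."""
    return size_gb * BYTES_PER_GB * 8 / ESTIMATED_WEIGHTS


if __name__ == "__main__":
    print(f"estimated weights: {ESTIMATED_WEIGHTS / 1e9:.3f} billion")
    for name, size in SIZES_GB.items():
        print(f"{name:8s} {bits_per_weight(size):5.2f} bits/weight")
```

Under these assumptions the estimate comes to about 1.5 billion weights, consistent with the 1.5B in the model name, and the F16 file works out to 16 bits per weight, which is a useful sanity check on the arithmetic.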

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).

Source repository: prithivMLmods/Arch-Router-1.5B-GGUF
