Arch-Router-1.5B

Arch-Router-1.5B introduces a preference-aligned routing framework that guides model selection by matching queries to user-defined domains (e.g., travel) or action types (e.g., image editing), offering a practical mechanism to encode preferences in routing decisions. Arch-Router itself is a compact 1.5B model that learns to map queries to domain-action preferences, and those mappings drive the routing decision. Experiments on conversational datasets demonstrate that this approach achieves state-of-the-art (SOTA) results in matching queries with human preferences, outperforming top proprietary models.
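To make the idea of mapping queries to user-defined routes more concrete, here is a minimal sketch of the glue code around a router model. Everything in it is an assumption for illustration: the route names, the `ROUTE_TO_MODEL` mapping, the prompt layout in `build_router_prompt`, and the JSON reply shape parsed by `select_model` are hypothetical and are not taken from the model authors' documentation. The call to the router model itself is left out.

```python
import json

# Hypothetical route policies: a name plus a plain-language description.
ROUTES = [
    {"name": "travel", "description": "Booking trips, flights, and hotels"},
    {"name": "image_editing", "description": "Editing or retouching images"},
]

# Hypothetical mapping from route name to the model that should serve it.
ROUTE_TO_MODEL = {"travel": "model-a", "image_editing": "model-b"}
DEFAULT_MODEL = "model-default"


def build_router_prompt(routes, query):
    """Assemble a prompt listing the routes and the query (assumed layout)."""
    return (
        "Routes:\n"
        + json.dumps(routes, indent=2)
        + "\n\nQuery:\n"
        + query
        + '\n\nReply with JSON: {"route": "<name>"}'
    )


def select_model(router_reply):
    """Parse the router's reply and map it to a model, falling back on errors."""
    try:
        route = json.loads(router_reply)["route"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return DEFAULT_MODEL
    if not isinstance(route, str):
        return DEFAULT_MODEL
    return ROUTE_TO_MODEL.get(route, DEFAULT_MODEL)


prompt = build_router_prompt(ROUTES, "Find me a flight to Lisbon")
print(select_model('{"route": "travel"}'))
```

Keeping the route-to-model table outside the router means preferences can be changed without touching the router model, which is the separation the description above points at.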

Model Files

| File Name | Size | Type | Description |
| --- | --- | --- | --- |
| Arch-Router-1.5B.Q2_K.gguf | 676 MB | Model | Q2_K quantized model (smallest) |
| Arch-Router-1.5B.Q3_K_S.gguf | 761 MB | Model | Q3_K_S quantized model |
| Arch-Router-1.5B.Q3_K_M.gguf | 824 MB | Model | Q3_K_M quantized model |
| Arch-Router-1.5B.Q3_K_L.gguf | 880 MB | Model | Q3_K_L quantized model |
| Arch-Router-1.5B.Q4_K_S.gguf | 940 MB | Model | Q4_K_S quantized model |
| Arch-Router-1.5B.Q4_K_M.gguf | 986 MB | Model | Q4_K_M quantized model |
| Arch-Router-1.5B.Q5_K_S.gguf | 1.1 GB | Model | Q5_K_S quantized model |
| Arch-Router-1.5B.Q5_K_M.gguf | 1.13 GB | Model | Q5_K_M quantized model |
| Arch-Router-1.5B.Q6_K.gguf | 1.27 GB | Model | Q6_K quantized model |
| Arch-Router-1.5B.Q8_0.gguf | 1.65 GB | Model | Q8_0 quantized model |
| Arch-Router-1.5B.BF16.gguf | 3.09 GB | Model | BF16 precision model |
| Arch-Router-1.5B.F16.gguf | 3.09 GB | Model | F16 precision model |
| Arch-Router-1.5B.F32.gguf | 6.18 GB | Model | F32 full precision model (largest) |
| .gitattributes | 2.49 kB | Config | Git LFS configuration |
| config.json | 31 Bytes | Config | Model configuration |
| README.md | 173 Bytes | Documentation | Repository documentation |
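When choosing among the files listed above, one simple rule is to take the largest quant that fits a given memory budget. The sketch below does that with the sizes copied from the list, treating 1 GB as 1000 MB. The `pick_quant` helper and its `overhead_mb` parameter (headroom for context and runtime buffers) are hypothetical, and the rule ignores quality differences between quants of similar size.

```python
# File sizes copied from the Model Files list, in MB (1 GB taken as 1000 MB).
QUANT_SIZES_MB = {
    "Q2_K": 676, "Q3_K_S": 761, "Q3_K_M": 824, "Q3_K_L": 880,
    "Q4_K_S": 940, "Q4_K_M": 986, "Q5_K_S": 1100, "Q5_K_M": 1130,
    "Q6_K": 1270, "Q8_0": 1650, "BF16": 3090, "F16": 3090, "F32": 6180,
}


def pick_quant(budget_mb, overhead_mb=0):
    """Return the file name of the largest quant that fits, or None."""
    fitting = [
        (size, name)
        for name, size in QUANT_SIZES_MB.items()
        if size + overhead_mb <= budget_mb
    ]
    if not fitting:
        return None
    _, name = max(fitting)  # largest file size wins
    return f"Arch-Router-1.5B.{name}.gguf"


print(pick_quant(1000))
print(pick_quant(2000, overhead_mb=400))
```

File size on disk is only a rough proxy for memory use at inference time, so the overhead value should be set generously.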

Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

image.png

Format: GGUF
Model size: 1.54B params
Architecture: qwen2
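The parameter count and the file sizes in the Model Files list can be combined into a rough figure for effective bits per weight. The sketch below assumes decimal units (1 GB = 10^9 bytes) and uses a hypothetical `bits_per_weight` helper. The results are approximate: each file also holds metadata, and quantized files typically store some tensors at higher precision than the nominal bit width.

```python
PARAMS = 1.54e9  # parameter count reported for this model

# A few file sizes from the Model Files list, converted to bytes.
SIZES_BYTES = {
    "Q2_K": 676e6,
    "Q4_K_M": 986e6,
    "Q8_0": 1.65e9,
    "F16": 3.09e9,
    "F32": 6.18e9,
}


def bits_per_weight(size_bytes, params=PARAMS):
    """Approximate bits stored per parameter for a file of the given size."""
    return size_bytes * 8 / params


for name, size in SIZES_BYTES.items():
    print(f"{name}: {bits_per_weight(size):.2f} bits/weight")
```

The F16 and F32 files come out close to 16 and 32 bits per weight, which is a useful sanity check on the unit assumption.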

Model tree for prithivMLmods/Arch-Router-1.5B-GGUF

Base model: Qwen/Qwen2.5-1.5B
Quantized (4): this model