Q6_K_C: Q6_K weights, untouched embeds, untouched output

Fits ≥24K CTX on a 24GiB GPU

Format: GGUF
Model size: 23.6B params
Architecture: llama

6-bit
