Q4F Q8A: Q4_K FFN tensors, Q8_0 attention, Q8_0 output, Q8_0 token embeddings
Fits ≥24K context with a Q8_0 KV cache on a 24 GiB GPU
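
As a rough sketch of that context claim, the invocation below shows how this quant might be served with llama.cpp's `llama-server` using a Q8_0 KV cache. The GGUF filename, GPU layer count, and exact context length are placeholders for illustration; they are not specified by this card.

```bash
# Minimal sketch, assuming llama.cpp's llama-server and a single 24 GiB GPU.
# The filename and numbers below are placeholders, not taken from this card.
# A quantized V cache usually needs flash attention; add -fa if your build
# does not enable it by default.
llama-server \
  -m ./model-q4f-q8a.gguf \
  --n-gpu-layers 99 \
  --ctx-size 24576 \
  --cache-type-k q8_0 \
  --cache-type-v q8_0
```

With Q8_0 keys and values the KV cache takes roughly half the memory of the default f16 cache, which is what leaves room for a 24K-token window alongside the weights in 24 GiB.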