---
base_model:
- enkryptai/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned
language:
- en
pipeline_tag: text-generation
---

# Melvin56/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned-GGUF

Original model: [enkryptai/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned](https://huggingface.co/enkryptai/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned)

All quants are made using the imatrix (importance matrix) option.

| Quant type | Size (GB) |
|:-----------|:---------:|
| Q2_K       | 3.17      |
| Q3_K_M     | 4.02      |
| Q4_K_M     | 4.92      |
| Q5_K_M     | 5.72      |
| Q6_K       | 6.59      |
| Q8_0       | 8.54      |
| F16        | 16.2      |

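
The sizes in the table can be used to estimate which quant fits a given memory budget. The following is a minimal sketch, not a sizing tool: the function names and the `headroom_gb` allowance for the KV cache and runtime buffers are hypothetical, and the bits-per-weight figure assumes the F16 file stores exactly 16 bits per weight, so it is only an average estimate.

```python
from typing import Optional

# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 3.17,
    "Q3_K_M": 4.02,
    "Q4_K_M": 4.92,
    "Q5_K_M": 5.72,
    "Q6_K": 6.59,
    "Q8_0": 8.54,
    "F16": 16.2,
}


def bits_per_weight(quant: str) -> float:
    """Rough average bits per weight, scaled from the F16 file size.

    Assumption: the F16 file uses 16 bits per weight and file metadata
    is negligible, so the result is approximate.
    """
    return 16.0 * QUANT_SIZES_GB[quant] / QUANT_SIZES_GB["F16"]


def largest_quant_that_fits(memory_gb: float, headroom_gb: float = 1.5) -> Optional[str]:
    """Return the largest quant whose file fits in memory_gb minus headroom_gb.

    headroom_gb is a hypothetical allowance; the real overhead depends on
    context length and backend. Returns None if nothing fits.
    """
    budget = memory_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for name in QUANT_SIZES_GB:
        print(f"{name}: ~{bits_per_weight(name):.2f} bits/weight")
    print("Pick for 8 GB:", largest_quant_that_fits(8.0))
```

With the default headroom, an 8 GB budget leaves 6.5 GB for the file, so the sketch selects Q5_K_M (5.72 GB) rather than Q6_K (6.59 GB).
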
|          | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL     | CLBlast | Vulkan | Kompute |
| :------- | :--------: | :------------: | :---: | :----: | :-----: | :------: | :-----: | :----: | :-----: |
| K-quants | ✅         | ✅             | ✅    | ✅     | ✅      | ✅       | ✅ 🐢⁵  | ✅ 🐢⁵ | ❌      |
| I-quants | ✅ 🐢⁴     | ✅ 🐢⁴         | ✅ 🐢⁴ | ✅     | ✅      | Partial¹ | ❌      | ❌     | ❌      |

```
✅: feature works
❌: feature does not work
❓: unknown, please contribute if you can test it yourself
🐢: feature is slow
¹: IQ3_S and IQ1_S, see #5886
²: Only with -ngl 0
³: Inference is 50% slower
⁴: Slower than K-quants of comparable size
⁵: Slower than cuBLAS/rocBLAS on similar cards
⁶: Only q8_0 and iq4_nl
```
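
To check a quant family against a backend programmatically, the matrix and its footnotes can be transcribed into a small lookup. This is only an illustrative sketch: the names `BACKENDS`, `SUPPORT`, `FOOTNOTES`, and `describe` are hypothetical, and reading "Partial" as "partially works" is an assumption.

```python
# Column order of the compatibility table above.
BACKENDS = [
    "CPU (AVX2)", "CPU (ARM NEON)", "Metal", "cuBLAS", "rocBLAS",
    "SYCL", "CLBlast", "Vulkan", "Kompute",
]

# Each cell is (status, footnote number or None), transcribed from the table.
SUPPORT = {
    "K-quants": [("yes", None)] * 6 + [("yes", 5), ("yes", 5), ("no", None)],
    "I-quants": [("yes", 4)] * 3 + [("yes", None)] * 2
                + [("partial", 1)] + [("no", None)] * 3,
}

# Only the footnotes actually referenced by the table.
FOOTNOTES = {
    1: "IQ3_S and IQ1_S, see #5886",
    4: "Slower than K-quants of comparable size",
    5: "Slower than cuBLAS/rocBLAS on similar cards",
}

STATUS_TEXT = {"yes": "works", "no": "does not work", "partial": "partially works"}


def describe(family: str, backend: str) -> str:
    """Return a one-line summary for a quant family on a backend."""
    status, note = SUPPORT[family][BACKENDS.index(backend)]
    text = STATUS_TEXT[status]
    if note is not None:
        text += f" ({FOOTNOTES[note]})"
    return f"{family} on {backend}: {text}"


if __name__ == "__main__":
    print(describe("K-quants", "Vulkan"))
    print(describe("I-quants", "SYCL"))
```
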