---
base_model:
- enkryptai/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned
language:
- en
pipeline_tag: text-generation
---
# Melvin56/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned-GGUF
Original Model : [enkryptai/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned](https://huggingface.co/enkryptai/DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned)
All quants are made using the imatrix option.
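For reference, an imatrix quant is typically produced with llama.cpp's `llama-imatrix` and `llama-quantize` tools along these lines (a sketch only; `calibration.txt` and the file names are placeholders, not the actual files used for this repo):

```shell
# Build an importance matrix from a calibration text file
# (calibration.txt is a placeholder for whatever dataset was used)
./llama-imatrix -m DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned-F16.gguf \
    -f calibration.txt -o imatrix.dat

# Quantize the F16 GGUF, guided by the importance matrix
./llama-quantize --imatrix imatrix.dat \
    DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned-F16.gguf \
    DeepSeek-R1-Distill-Llama-8B-Enkrypt-Aligned-Q4_K_M.gguf Q4_K_M
```

The imatrix weights the quantization error by how important each tensor element was on the calibration data, which mainly helps the smaller quants (Q2/Q3 and the I-quants).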
| Model | Size (GB) |
|:-------------------------------------------------|:-------------:|
| Q2_K | 3.17 |
| Q3_K_M | 4.02 |
| Q4_K_M | 4.92 |
| Q5_K_M | 5.72 |
| Q6_K | 6.59 |
| Q8_0 | 8.54 |
| F16 | 16.2 |
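As a rough sanity check on the table, the effective bits per weight of each quant can be estimated from the file sizes, taking the F16 file as a 16-bit baseline (a back-of-the-envelope sketch; it ignores that some tensors are kept at higher precision):

```python
# File sizes in GB, copied from the table above.
sizes_gb = {
    "Q2_K": 3.17, "Q3_K_M": 4.02, "Q4_K_M": 4.92,
    "Q5_K_M": 5.72, "Q6_K": 6.59, "Q8_0": 8.54, "F16": 16.2,
}

def bits_per_weight(name: str) -> float:
    """Estimate bits/weight relative to the 16-bit F16 baseline."""
    return round(sizes_gb[name] / sizes_gb["F16"] * 16, 2)

for name in sizes_gb:
    print(f"{name}: ~{bits_per_weight(name)} bpw")
```

For example, Q4_K_M works out to roughly 4.9 bits per weight, in line with its name.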
| Quant type | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL | CLBlast | Vulkan | Kompute |
| :------------ | :---------: | :------------: | :---: | :----: | :-----: | :---: | :------: | :----: | :------: |
| K-quants | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ 🐢⁵ | ✅ 🐢⁵ | ❌ |
| I-quants | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ | ✅ | Partial¹ | ❌ | ❌ | ❌ |
```
✅: feature works
❌: feature does not work
❓: unknown, please contribute if you can test it yourself
🐢: feature is slow
¹: IQ3_S and IQ1_S, see #5886
⁴: Slower than K-quants of comparable size
⁵: Slower than cuBLAS/rocBLAS on similar cards
```