DevQuasar/deepseek-ai.DeepSeek-R1-Zero-bf16
Restarted the space. Regarding the speed: I found I had forgotten to offload the model to the GPU :D
Try now. Here you can try it:
https://huggingface.co/spaces/DevQuasar/Mi50
But something seems off with my network or with HF: everything is very slow.
When I ran llama-bench on the model I got 60 t/s on the MI50.
Anyway, you can try it.
```
ROCR_VISIBLE_DEVICES=0 build/bin/llama-bench -m ~/Downloads/DevQuasar-R1-Uncensored-Llama-8B.Q8_0.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon VII, compute capability 9.0, VMM: no
```
| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| llama 8B Q8_0 | 7.95 GiB | 8.03 B | ROCm | 99 | pp512 | 416.30 ± 0.07 |
| llama 8B Q8_0 | 7.95 GiB | 8.03 B | ROCm | 99 | tg128 | 60.13 ± 0.02 |
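If you want to compare runs programmatically, the markdown table llama-bench prints can be parsed with a few lines of Python. This is just a sketch assuming the pipe-separated row layout shown above (`parse_bench` and the column positions are my own assumptions, not part of llama-bench):

```python
import re

def parse_bench(output: str) -> dict:
    """Extract t/s numbers from llama-bench markdown rows.

    Assumes the 7-column layout above: the test name (e.g. pp512, tg128)
    is column 6 and the throughput "mean ± stddev" is column 7.
    """
    results = {}
    for line in output.splitlines():
        cols = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cols) >= 7 and cols[5] in ("pp512", "tg128"):
            m = re.match(r"([\d.]+)", cols[6])  # keep the mean, drop "± stddev"
            if m:
                results[cols[5]] = float(m.group(1))
    return results

sample = """\
llama 8B Q8_0 | 7.95 GiB | 8.03 B | ROCm | 99 | pp512 | 416.30 ± 0.07 |
llama 8B Q8_0 | 7.95 GiB | 8.03 B | ROCm | 99 | tg128 | 60.13 ± 0.02 |
"""
print(parse_bench(sample))  # {'pp512': 416.3, 'tg128': 60.13}
```

Handy if you want to diff prompt-processing vs. token-generation speed across quant levels or GPUs without eyeballing the tables.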