bndp
AI & ML interests
None yet
Recent Activity
reacted to csabakecskemeti's post about 17 hours ago
I'm collecting llama-bench results for inference with llama 3.1 8B q4 and q8 reference models on various GPUs. The results are the average of 5 executions.
The systems vary (different motherboards and CPUs), but that probably has little effect on inference performance.
https://devquasar.com/gpu-gguf-inference-comparison/
The exact models used are listed on the page.
I'd welcome results from other GPUs if you have access to anything else; everything you need is in the post. Hopefully this is useful information for everyone.
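A minimal sketch of how such runs could be scripted with llama.cpp's llama-bench, averaging 5 executions per model as in the post. The model filenames and the GPU-layer count are assumptions; substitute the exact GGUF files linked on the page.

```python
# Sketch: run llama-bench for q4 and q8 quantizations of a Llama 3.1 8B GGUF
# and report the average of 5 runs (-r 5). Filenames and -ngl are assumptions.
import subprocess

MODELS = [
    "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf",  # q4 reference model (assumed name)
    "Meta-Llama-3.1-8B-Instruct.Q8_0.gguf",    # q8 reference model (assumed name)
]

for model in MODELS:
    # llama-bench itself averages tokens/s over the -r repetitions.
    result = subprocess.run(
        [
            "./llama-bench",
            "-m", model,
            "-ngl", "99",   # offload all layers to the GPU
            "-r", "5",      # average of 5 executions, as in the post
            "-o", "md",     # markdown table output
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout)
```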
upvoted an article 2 days ago
Open R1: How to use OlympicCoder locally for coding?
Organizations
None yet
models
None public yet
datasets
None public yet