Fix template when add_generation_prompt=true
#14 opened about 11 hours ago by matteogeniaccio
It supports Serbo-Croatian language very well!
2
1
#13 opened 1 day ago by JLouisBiz

GPTQ or AWQ Quants
1
#12 opened 1 day ago by guialfaro
Great job, thanks for this model.
3
#11 opened 3 days ago by Dampfinchen
recommended sampling parameters?
#10 opened 4 days ago by AaronFeng753
Can we have some more popular benchmarks
1
#8 opened 5 days ago by rombodawg

The model is the best for coding.
3
3
#7 opened 8 days ago by AekDevDev

When running with a single GPU, I get an error saying the VRAM is insufficient. However, when using multiple GPUs on a single machine, there are many errors. My vllm version is 0.8.4.
1
#6 opened 8 days ago by hanson888

BitsAndBytes quantization inference error
1
#5 opened 9 days ago by chengfy

Some bug when using function call with vllm==0.8.4
2
#4 opened 9 days ago by waple

SimpleQA Scores Are WAY off
4
5
#3 opened 11 days ago by phil111
Need fp8 version for inference
1
#2 opened 11 days ago by iwaitu

RuntimeError: CUDA error: device-side assert triggered
#1 opened 12 days ago by DsnTgr