Change log to Qwen 3
pinned🔥
3
2
#20 opened about 1 month ago by danielhanchen

No IQ2_XXS version?
🚀
1
4
#19 opened about 2 months ago by xldistance
ollama? please?
3
#17 opened about 2 months ago by AlgorithmicKing

q1 quants?
6
#16 opened about 2 months ago by AliceBeta
Amazing, uses a humble 255 GB of RAM on my ancient 2014 Xeon PC for Q8
👍
❤️
1
1
#15 opened about 2 months ago by krustik
Quality benefits of UD-Q4_K_XL vs Q5_K_M vs Q6_K for this model?
👀
4
#14 opened about 2 months ago by ideosphere
Finetuning possible?
2
#12 opened 2 months ago by edgeinfinity
Why are XL quants smaller than M quants?
4
#11 opened 2 months ago by ChuckMcSneed

config.json "max_position_embeddings": 40960,
2
#10 opened 2 months ago by koushd
how to disable <think> with llama.cpp
4
#9 opened 2 months ago by bobchenyx

It seems like the model has serious repetition issues (both GGUF and on OpenRouter)
6
#8 opened 2 months ago by roadtoagi

[Qwen3-235B-A22B-UD-Q4_K_XL.gguf] UD Quant seems to be invalid.
2
#7 opened 2 months ago by XelotX
Test on 3090 + Tesla P40 (48 GB VRAM total) + 64 GB RAM (Q2K)
1
#6 opened 2 months ago by roadtoagi

UD quants please🥺
2
#5 opened 2 months ago by Ainonake
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed!
1
#4 opened 2 months ago by shakhizat
Do the Q4 quants work? On the 30b moe it says not to use them.
2
#3 opened 2 months ago by Lockout

UD quants missing some files
➕
3
6
#2 opened 2 months ago by MLDataScientist
Add languages tag
#1 opened 2 months ago by de-francophones
