The reduced self-attention dimensions are top-notch
🤗
1
1
#5 opened about 15 hours ago by owao
This model is not much better than Qwen3 32B for writing code
👍
1
2
#4 opened about 23 hours ago by xldistance
Transformers does not recognize the `exaone4` model type
1
#3 opened about 24 hours ago by BrandNewGD
Fix Jinja bugs; reasoning mode is used by default
2
#2 opened 1 day ago by xldistance