Thireus
AI & ML interests
None yet
Recent Activity
liked
a dataset
about 1 hour ago
Salesforce/wikitext
liked
a model
about 7 hours ago
anikifoss/DeepSeek-R1-0528-DQ4_K_R4
new activity
about 8 hours ago
Qwen/Qwen3-Embedding-0.6B: ONNX version planned?
Organizations
None yet
Thireus's activity
ONNX version planned?
2
#17 opened about 18 hours ago
by
Florianoli
Metrics for 110k context size?
#4 opened about 8 hours ago
by
Thireus
Scripts to produce PPL and KLD diagrams?
1
#10 opened 1 day ago
by
Thireus
DeepSeek-R1-0528-DQ2_K_R4
4
#3 opened 6 days ago
by
Thireus
DeepSeek R1 0528?
#15 opened 5 days ago
by
Thireus
This model almost completely loses Chinese abilities
1
3
#14 opened 29 days ago
by
CHNtentes
Potentially still broken?
3
13
#8 opened about 1 month ago
by
qenthousiast
Brainstorming
5
5
#6 opened about 1 month ago
by
Downtown-Case
UD version not doing great with YaRN compared to non-UD of the same size
4
#4 opened about 1 month ago
by
Thireus
PPL vs model size - safe to assume larger size == better accuracy regardless of UD vs non-UD?
3
#6 opened about 1 month ago
by
Thireus
Potential issue with large context sizes - can someone confirm?
15
#18 opened about 1 month ago
by
Thireus
Qwen is losing broad knowledge since Qwen2.
11
14
#16 opened about 1 month ago
by
phil111
YaRN not enabled correctly
1
1
#3 opened about 1 month ago
by
CISCai
Any chance of a 128k version so we can use it as a draft model for the larger 128k models?
3
8
#3 opened about 1 month ago
by
smcleod

Could we have the 128K variants as well please?
1
#1 opened about 1 month ago
by
Thireus
Context size? YaRN still supported?
2
#3 opened about 1 month ago
by
Thireus
CORRECTION: THIS SYSTEM MESSAGE IS ***PURE GOLD***!!!
17
16
#33 opened 9 months ago
by
jukofyork

FP8 Checkpoint version size mismatch?
2
#15 opened 10 months ago
by
Thireus
What is the "system" prompt?
3
#13 opened about 1 year ago
by
kk3dmax
First Word Ignored Issue / Single Word Instruction
1
20
#11 opened about 1 year ago
by
pandora-s
