Thireus

AI & ML interests

None yet

Recent Activity

liked a dataset about 1 hour ago

Salesforce/wikitext

liked a model about 7 hours ago

anikifoss/DeepSeek-R1-0528-DQ4_K_R4

new activity about 8 hours ago

Qwen/Qwen3-Embedding-0.6B: ONNX version planned?

Organizations

None yet

Thireus's activity

New activity in Qwen/Qwen3-Embedding-0.6B about 8 hours ago

ONNX version planned?

#17 opened about 18 hours ago by

New activity in anikifoss/DeepSeek-R1-0528-DQ4_K_R4 about 8 hours ago

Metrics for 110k context size?

#4 opened about 8 hours ago by

New activity in ubergarm/DeepSeek-R1-0528-GGUF 1 day ago

Scripts to produce PPL and KLD diagrams?

#10 opened 1 day ago by

New activity in anikifoss/DeepSeek-R1-0528-DQ4_K_R4 5 days ago

DeepSeek-R1-0528-DQ2_K_R4

#3 opened 6 days ago by

New activity in kalomaze/Qwen3-16B-A3B 5 days ago

DeepSeek R1 0528?

#15 opened 5 days ago by

This model almost completely loses Chinese abilities

#14 opened 29 days ago by

New activity in unsloth/Qwen3-32B-GGUF about 1 month ago

Potentially still broken?

#8 opened about 1 month ago by

New activity in kalomaze/Qwen3-16B-A3B about 1 month ago

Brainstorming

#6 opened about 1 month ago by

New activity in unsloth/Qwen3-32B-128K-GGUF about 1 month ago

UD version not doing great with YaRN compared to non-UD of the same size

#4 opened about 1 month ago by

New activity in unsloth/Qwen3-32B-GGUF about 1 month ago

PPL vs model size - safe to assume larger size == better accuracy regardless of UD vs non-UD?

#6 opened about 1 month ago by

New activity in Qwen/Qwen3-32B about 1 month ago

Potential issue with large context sizes - can someone confirm?

#18 opened about 1 month ago by

New activity in Qwen/Qwen3-235B-A22B about 1 month ago

Qwen is losing broad knowledge since Qwen2.

#16 opened about 1 month ago by

New activity in unsloth/Qwen3-235B-A22B-128K-GGUF about 1 month ago

YaRN not enabled correctly

#3 opened about 1 month ago by

New activity in unsloth/Qwen3-0.6B-GGUF about 1 month ago

Any chance of a 128k version so we can use it as a draft model for the larger 128k models?

#3 opened about 1 month ago by

New activity in unsloth/Qwen3-16B-A3B-GGUF about 1 month ago

Could we have the 128K variants as well please?

#1 opened about 1 month ago by

New activity in kalomaze/Qwen3-16B-A3B about 1 month ago

Context size? YaRN still supported?

#3 opened about 1 month ago by

New activity in mattshumer/Reflection-Llama-3.1-70B 9 months ago

CORRECTION: THIS SYSTEM MESSAGE IS PURE GOLD!!!

#33 opened 9 months ago by

New activity in Kijai/flux-fp8 10 months ago

FP8 Checkpoint version size mismatch?

#15 opened 10 months ago by

New activity in meta-llama/Meta-Llama-3-70B-Instruct about 1 year ago

What is the "system" prompt?

#13 opened about 1 year ago by

New activity in mistralai/Mixtral-8x22B-Instruct-v0.1 about 1 year ago

First Word Ignored Issue / Single Word Instruction

#11 opened about 1 year ago by