ThomasBaruzier
Recent Activity
Liked a model 7 days ago: Qwen/Qwen3-VL-30B-A3B-Thinking
Liked a model 7 days ago: Qwen/Qwen3-VL-30B-A3B-Instruct
Liked a model 8 days ago: ubergarm/GLM-4.6-GGUF
EXAONE-3.5
EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B.
Qwen QwQ
Qwen with Questions
Llama 3.2 Instruct
Llama 3.2 language models, featuring instruction-tuned models in two sizes: 1B and 3B.
Llama 3 Instruct
Llama 3 language models, featuring instruction-tuned models in two sizes: 8B and 70B.
DeepSeek-R1-ReDistill
Re-distilled DeepSeek R1 models
Qwen 2.5 Coder Instruct
Code-specific model series based on Qwen2.5
ThomasBaruzier/Qwen2.5-Coder-0.5B-Instruct-GGUF
Text Generation • 0.5B • Updated • 2.41k • 1
ThomasBaruzier/Qwen2.5-Coder-1.5B-Instruct-GGUF
Text Generation • 2B • Updated • 2.29k
ThomasBaruzier/Qwen2.5-Coder-3B-Instruct-GGUF
Text Generation • 3B • Updated • 2.37k
ThomasBaruzier/Qwen2.5-Coder-7B-Instruct-GGUF
Text Generation • 8B • Updated • 1.98k
Qwen 2.5 Instruct
Qwen 2.5 language models, featuring instruction-tuned models in seven sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.
ThomasBaruzier/Qwen2.5-0.5B-Instruct-GGUF
Text Generation • 0.5B • Updated • 114
ThomasBaruzier/Qwen2.5-1.5B-Instruct-GGUF
Text Generation • 2B • Updated • 256
ThomasBaruzier/Qwen2.5-3B-Instruct-GGUF
Text Generation • 3B • Updated • 88
ThomasBaruzier/Qwen2.5-7B-Instruct-GGUF
Text Generation • 8B • Updated • 213
Llama 3.1 Instruct
Llama 3.1 language models, featuring instruction-tuned models in three sizes: 8B, 70B, and 405B.
Gemma 2
Gemma 2 language models, featuring instruction-tuned models in three sizes: 2B, 9B, and 27B.
DeepScaleR-1.5B-Preview
DeepScaleR-1.5B-Preview is a language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. It is reported to beat o1-preview on math benchmarks.