ThomasBaruzier
Recent Activity
- Liked a model 17 days ago: arcee-ai/Virtuoso-Large
- Liked a model 19 days ago: nvidia/AceReason-Nemotron-1.1-7B
- Liked a model 19 days ago: moonshotai/Kimi-Dev-72B
EXAONE-3.5
EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B.
Qwen QwQ
Qwen with Questions
Llama 3.2 Instruct
Llama 3.2 language models, instruction-tuned, in 2 sizes: 1B and 3B.
Llama 3 Instruct
Llama 3 language models, instruction-tuned, in 2 sizes: 8B and 70B.
DeepSeek-R1-ReDistill
Re-distilled DeepSeek R1 models
Qwen 2.5 Coder Instruct
Code-specific model series based on Qwen2.5
- ThomasBaruzier/Qwen2.5-Coder-0.5B-Instruct-GGUF
  Text Generation • 0.5B • Updated • 733 • 1
- ThomasBaruzier/Qwen2.5-Coder-1.5B-Instruct-GGUF
  Text Generation • 2B • Updated • 600
- ThomasBaruzier/Qwen2.5-Coder-3B-Instruct-GGUF
  Text Generation • 3B • Updated • 765
- ThomasBaruzier/Qwen2.5-Coder-7B-Instruct-GGUF
  Text Generation • 8B • Updated • 364
Qwen 2.5 Instruct
Qwen 2.5 language models, instruction-tuned, in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.
- ThomasBaruzier/Qwen2.5-0.5B-Instruct-GGUF
  Text Generation • 0.5B • Updated • 1.56k
- ThomasBaruzier/Qwen2.5-1.5B-Instruct-GGUF
  Text Generation • 2B • Updated • 444
- ThomasBaruzier/Qwen2.5-3B-Instruct-GGUF
  Text Generation • 3B • Updated • 58
- ThomasBaruzier/Qwen2.5-7B-Instruct-GGUF
  Text Generation • 8B • Updated • 157
Llama 3.1 Instruct
Llama 3.1 language models, instruction-tuned, in 3 sizes: 8B, 70B, and 405B.
Gemma 2
Gemma 2 language models, instruction-tuned, in 3 sizes: 2B, 9B, and 27B.
DeepScaleR-1.5B-Preview
DeepScaleR-1.5B-Preview is a language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. Reported to beat o1-preview in math.