sunblaze-ucb's Collections
Intuitor
Updated Jun 25
Models in the paper "Learning to Reason without External Rewards"
sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH • Text Generation • 3B • Updated Jun 16 • 335 • 1
sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH • Text Generation • 2B • Updated Jun 16 • 11
sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH • Text Generation • 15B • Updated Jun 16 • 56
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH • Text Generation • 7B • Updated Jun 16 • 18
sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH • Text Generation • 15B • Updated Jun 16 • 5
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH • Text Generation • 7B • Updated Jun 16 • 19
sunblaze-ucb/Qwen2.5-3B-GRPO-MATH-1EPOCH • Text Generation • 3B • Updated Jun 2 • 111
sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH • Text Generation • 2B • Updated Jun 2 • 5
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Jun 24 • 7
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Jun 24 • 25
sunblaze-ucb/Llama-3.2-3B-Instruct-Intuitor-MATH-1EPOCH • Text Generation • 4B • Updated Jun 25 • 23
sunblaze-ucb/Llama-3.2-3B-Instruct-GRPO-MATH-1EPOCH • Text Generation • 4B • Updated Jun 25 • 19