sunblaze-ucb's Collections
Intuitor
Updated Jun 25
Models in the paper "Learning to Reason without External Rewards"
sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH • Text Generation • 3B • Updated 27 days ago • 151 • 1
sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH • Text Generation • 2B • Updated 27 days ago • 79
sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH • Text Generation • 15B • Updated 27 days ago • 517
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH • Text Generation • 7B • Updated 27 days ago • 27
sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH • Text Generation • 15B • Updated 27 days ago • 20
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH • Text Generation • 7B • Updated 27 days ago • 22
sunblaze-ucb/Qwen2.5-3B-GRPO-MATH-1EPOCH • Text Generation • 3B • Updated 27 days ago • 12
sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH • Text Generation • 2B • Updated 27 days ago • 39
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Jun 24 • 23
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Jun 24 • 9
sunblaze-ucb/Llama-3.2-3B-Instruct-Intuitor-MATH-1EPOCH • Text Generation • 4B • Updated Jun 25 • 14
sunblaze-ucb/Llama-3.2-3B-Instruct-GRPO-MATH-1EPOCH • Text Generation • 4B • Updated Jun 25 • 12