sunblaze-ucb's Collections
Intuitor

updated Jun 25

Models from the paper "Learning to Reason without External Rewards"

  • sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH

    Text Generation • 3B • Updated 27 days ago • 151 • 1

  • sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH

    Text Generation • 2B • Updated 27 days ago • 79

  • sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH

    Text Generation • 15B • Updated 27 days ago • 517

  • sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH

    Text Generation • 7B • Updated 27 days ago • 27

  • sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH

    Text Generation • 15B • Updated 27 days ago • 20

  • sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH

    Text Generation • 7B • Updated 27 days ago • 22

  • sunblaze-ucb/Qwen2.5-3B-GRPO-MATH-1EPOCH

    Text Generation • 3B • Updated 27 days ago • 12

  • sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH

    Text Generation • 2B • Updated 27 days ago • 39

  • sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH-SYSP

    Text Generation • 7B • Updated Jun 24 • 23

  • sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH-SYSP

    Text Generation • 7B • Updated Jun 24 • 9

  • sunblaze-ucb/Llama-3.2-3B-Instruct-Intuitor-MATH-1EPOCH

    Text Generation • 4B • Updated Jun 25 • 14

  • sunblaze-ucb/Llama-3.2-3B-Instruct-GRPO-MATH-1EPOCH

    Text Generation • 4B • Updated Jun 25 • 12