INSAIT-Institute's Collections

BrokenMath

updated Oct 10

The first benchmark for evaluating LLM sycophancy in mathematical reasoning.

  • INSAIT-Institute/BrokenMath

    Viewer • Updated Oct 7 • 15.4k • 43 • 1

  • INSAIT-Institute/BrokenMath-Qwen3-4B

    4B • Updated Oct 7 • 4

  • BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs

    Paper • 2510.04721 • Published Oct 6