arXiv:2511.06221

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Published on Nov 9
Submitted by DenseHub on Nov 12
#2 Paper of the day
Authors:
Sen Xu et al.

Abstract

VibeThinker-1.5B, a 1.5B-parameter model using the Spectrum-to-Signal Principle, achieves superior reasoning capabilities compared to larger models at a significantly lower cost.

AI-generated summary

Challenging the prevailing consensus that small models inherently lack robust reasoning, this report introduces VibeThinker-1.5B, a 1.5B-parameter dense model developed via our Spectrum-to-Signal Principle (SSP). It counters the dominant approach of scaling model parameters to enhance capabilities, exemplified by models such as DeepSeek R1 (671B) and Kimi k2 (>1T). The SSP framework first employs a Two-Stage Diversity-Exploring Distillation (SFT) to generate a broad spectrum of solutions, followed by MaxEnt-Guided Policy Optimization (RL) to amplify the correct signal. With a total training cost of only $7,800, VibeThinker-1.5B demonstrates superior reasoning capabilities compared to closed-source models like Magistral Medium and Claude Opus 4, and performs on par with open-source models like GPT OSS-20B Medium. Remarkably, it surpasses the 400x larger DeepSeek R1 on three math benchmarks: AIME24 (80.3 vs. 79.8), AIME25 (74.4 vs. 70.0), and HMMT25 (50.4 vs. 41.7), a substantial improvement over its base model's scores of 6.7, 4.3, and 0.6, respectively. On LiveCodeBench V6, it scores 51.1, outperforming Magistral Medium's 50.3 and its base model's 0.0. These findings demonstrate that small models can achieve reasoning capabilities comparable to large models, drastically reducing training and inference costs and thereby democratizing advanced AI research.

Community

Through the innovative Spectrum-to-Signal Principle (SSP) training methodology, the 1.5B-parameter VibeThinker-1.5B surpasses models hundreds of times larger on multiple reasoning benchmarks, demonstrating at extremely low cost that small models can also achieve top-tier reasoning capability.
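The paper describes the RL stage only at a high level here, so the snippet below is a minimal sketch of an entropy-regularized (maximum-entropy) policy-gradient loss, assuming "MaxEnt-guided" optimization means rewarding verified-correct solutions while an entropy bonus preserves the solution diversity produced by the SFT stage. The function name, coefficient, and REINFORCE-style form are illustrative assumptions, not the paper's exact MGPO objective.

```python
# Sketch: entropy-regularized policy-gradient update (assumed form, not the paper's exact MGPO).
import torch
import torch.nn.functional as F

def maxent_policy_gradient_loss(logits, actions, rewards, entropy_coef=0.01):
    """logits: [batch, seq, vocab]; actions: [batch, seq]; rewards: [batch]."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Log-probability of the sampled tokens (the generated solution).
    action_log_probs = log_probs.gather(-1, actions.unsqueeze(-1)).squeeze(-1).sum(dim=-1)
    # Mean token-level entropy: the maximum-entropy bonus that keeps outputs diverse.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean(dim=-1)
    # REINFORCE-style objective: amplify trajectories with positive reward
    # (e.g. verified-correct answers), plus the entropy bonus.
    return -(rewards * action_log_probs + entropy_coef * entropy).mean()

# Toy usage: batch of 2 sampled solutions, vocab of 8, length 5.
logits = torch.randn(2, 5, 8, requires_grad=True)
actions = torch.randint(0, 8, (2, 5))
rewards = torch.tensor([1.0, 0.0])  # e.g. verified correct vs. incorrect
maxent_policy_gradient_loss(logits, actions, rewards).backward()
```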

GitHub https://github.com/WeiboAI/VibeThinker


Paper author · Paper submitter:

An extreme test of whether a 1.5B model can achieve strong reasoning ability.

Paper author · Paper submitter:

SimpleTestForVibeThinker
A simple evaluation (we still recommend testing this model with competitive math / Python algorithm tasks).
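For readers who want to run such a test locally, below is a minimal sketch using Hugging Face transformers. The repository id, prompt, and sampling settings are assumptions based on the GitHub organization; check the model card and repository for the exact name and recommended generation parameters.

```python
# Sketch: prompting the model with a competition-style math problem (assumed repo id and settings).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WeiboAI/VibeThinker-1.5B"  # assumed repository id; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user",
             "content": "Find the number of positive integers n <= 1000 such that n^2 + 1 is divisible by 5."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True,
                         temperature=0.6, top_p=0.95)
# Print only the newly generated tokens (the model's reasoning and answer).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```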


There is no point in testing LLMs, even math-specific ones, with tasks like this. LLMs won't be, and are not supposed to be, used this way. Please stop testing them like this; use a calculator or a tool instead.

VibeThinker is an astonishing model. I've already tested it on writing some algorithms and, despite its size, it handles them very well. Code optimization problems, though, are still unsolvable in any meaningful way.

Nice work! Would you kindly share more details, such as the RL training curves and SFT/RL performance?


Models citing this paper 2

Datasets citing this paper 0


Spaces citing this paper 3

Collections including this paper 1