Model Overview

DeepSeek-R1-Distill-Qwen-1.5B-LIMO is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, trained on the LIMO (Less Is More for Reasoning) dataset. The fine-tune focuses on improving mathematical reasoning and problem-solving while keeping the small footprint of the base model. The LIMO dataset, containing only 817 high-quality reasoning samples, challenges conventional data-scaling assumptions by demonstrating strong reasoning performance with minimal training data.

Model Name: Josephgflowers/DeepSeek-R1-Distill-Qwen-1.5B-LIMO



Key Features

  • Mathematical Reasoning Focus: Trained on the LIMO dataset to enhance step-by-step logical deduction and problem-solving.
  • Optimized for Efficiency: Based on DeepSeek-R1-Distill-Qwen-1.5B, a compact model distilled from DeepSeek-R1 that retains much of its larger counterpart's reasoning capability.
  • Chain-of-Thought (CoT) Emphasis: Encourages structured, step-by-step reasoning with improved interpretability.
  • Minimal Data, Strong Generalization: Leverages only 817 high-quality training samples to achieve competitive reasoning performance.

Model Details

  • Base Model: DeepSeek-R1-Distill-Qwen-1.5B
  • Parameter Count: 1.5B
  • Training Framework: Unsloth / Hugging Face Transformers
  • Dataset: LIMO (Less Is More for Reasoning)
  • Primary Use Cases:
    • Mathematical and logical reasoning
    • STEM education and tutoring
    • Instruction-following for structured problem-solving
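How to Use

For quick experimentation, the model can be loaded with Hugging Face Transformers. The sketch below is a minimal example; the dtype and sampling settings are illustrative assumptions rather than tuned recommendations.

```python
# Minimal inference sketch with Hugging Face Transformers.
# The repo id comes from this card; generation settings are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Josephgflowers/DeepSeek-R1-Distill-Qwen-1.5B-LIMO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the published weights
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Solve step by step: if 3x + 5 = 20, what is x?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For the example prompt, 3x + 5 = 20 gives 3x = 15, so x = 5; a well-structured response should walk through those steps before stating the answer.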

Training Data

This model was fine-tuned on the LIMO dataset, which shows that competitive reasoning performance can be achieved with only 817 carefully curated training samples.

Dataset Highlights:

  • Name: LIMO (Less Is More for Reasoning)
  • Size: 817 samples
  • Focus: Chain-of-thought reasoning for structured problem-solving
  • Key Motivation:
    • High-quality, curated reasoning samples outperform large-scale noisy data.
    • Designed to enhance reasoning generalization in a minimal-data setting.
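To inspect the training data, the dataset can be pulled with the `datasets` library. Note that this card does not give the dataset's Hub path; the repo id GAIR/LIMO below is an assumption based on LIMO's public release.

```python
# Sketch: load the LIMO dataset for inspection.
# ASSUMPTION: the Hub repo id "GAIR/LIMO" is not stated on this card.
from datasets import load_dataset

limo = load_dataset("GAIR/LIMO", split="train")
print(len(limo))       # this card reports 817 curated samples
print(limo[0].keys())  # problem statement plus a worked solution (field names may vary)
```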

Known Issues & Limitations

  • System Instructions: The model may require explicit user instructions to enforce structured reasoning (see the prompt sketch after this list).
  • Generalization Scope: While the LIMO dataset is effective for reasoning, additional fine-tuning may be required for broader NLP applications.
  • Computation Constraints: As a 1.5B parameter model, it is more lightweight than larger reasoning models but still requires moderate computational resources.
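The first limitation above can usually be worked around by stating the desired reasoning format directly in the user prompt. The wording below is only an illustration; it reuses the tokenizer and model from the loading sketch above.

```python
# Sketch: an explicit user instruction to encourage structured, step-by-step reasoning.
# Reuses `tokenizer` and `model` from the loading example; the prompt wording is illustrative.
prompt = (
    "Reason step by step and put the final answer on its own last line.\n"
    "A train travels 120 km in 1.5 hours. What is its average speed in km/h?"
)
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For reference, 120 km / 1.5 h = 80 km/h, so a well-formed response should end with that value on its own line.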

Citation

@misc{ye2025limoreasoning,
  title={LIMO: Less is More for Reasoning},
  author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
  year={2025},
  eprint={2502.03387},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.03387},
}

@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
  title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
  author={DeepSeek-AI and multiple contributors},
  year={2025},
  eprint={2501.12948},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.12948},
}
