Josephgflowers
/

TinyLlama-R1-LIMO

Model card Files Files and versions Community

TinyLlama-R1-LIMO / README.md

Josephgflowers's picture

Update README.md

2102138 verified 9 days ago

|

history blame contribute delete

2.47 kB

	---
	license: mit
	datasets:
	- GAIR/LIMO
	base_model:
	- Josephgflowers/Tinyllama-STEM-Cinder-Agent-v1
	---

	## Model Overview
	TinyLlama-R1-LIMO is a small, efficient transformer-based model designed to improve mathematical reasoning with minimal but high-quality training data. It was fine-tuned on the LIMO dataset, which emphasizes the principle that "Less Is More" for reasoning tasks. The model is part of ongoing research to enhance instruction-following and reasoning capabilities using a dataset of only 817 curated samples.


	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6328952f798f8d122ce62a44/kW4PCQ387on6BErRz_Fnn.png)

	Model Name: `Josephgflowers/Tinyllama-R1-LIMO-Agent`

	This model was made possible by the generous support of www.cherryrepublic.com

	---

	## Key Features
	- Data Efficiency: Achieves competitive reasoning performance using the LIMO dataset with only 817 training samples.
	- Mathematical Reasoning Focus: Tailored for tasks requiring logical and numerical problem-solving.
	- Instruction Adaptation: The model shows improved chain-of-thought (CoT) reasoning but may require further refinement for handling complex, multi-step prompts.
	- Training Pipeline: Built using the LLaMA-Factory framework with dataset-specific optimizations.

	---

	## Model Details
	- Model Type: Transformer-based (TinyLlama architecture)
	- Parameter Count: 1.1B
	- Training Framework: Unsloth 8k context / Hugging Face Transformers
	- Primary Use Cases:
	- Mathematical and logical reasoning
	- STEM education and problem-solving
	- Instruction-following conversations

	---

	## Training Data
	This model was fine-tuned using the LIMO dataset, which emphasizes the power of high-quality data over quantity.

	### Dataset Highlights
	- Name: LIMO (Less Is More for Reasoning)
	- Size: 817 samples

	Acknowledgments

	Thanks to the creators of the LIMO dataset and contributors to the LLaMA-Factory training framework. Special thanks to Joseph Flowers for model fine-tuning and experimentation.
	Citation

	If you use this model or dataset, please cite the following paper:

	@misc{ye2025limoreasoning,
	title={LIMO: Less is More for Reasoning},
	author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
	year={2025},
	eprint={2502.03387},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2502.03387},
	}