---
library_name: transformers
license: mit
datasets:
- eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
language:
- en
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
pipeline_tag: question-answering
---

# Model Card for OpenAI GSM8K Dataset Enhanced with Reasoning

This model is fine-tuned to answer questions from the OpenAI GSM8K dataset, enhanced with reasoning traces generated by DeepSeek R1. A publicly available Colab notebook for testing the model is shared [here](https://colab.research.google.com/drive/1B_Fbz0w76QxHbo9zAOf_pyZKKNI0EJJ9?usp=sharing).

---

## Model Details

### Model Description

This is a transformer-based question-answering model fine-tuned from `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`. It was trained on a dataset derived from the OpenAI GSM8K benchmark, enhanced with chain-of-thought reasoning to encourage intermediate logical steps. The dataset pairs math word problems with structured answers, using `<think>...</think>` and `<answer>...</answer>` tags.

- **Developed by:** Yiqiao Yin
- **Model type:** Causal Language Model (fine-tuned for Q&A with reasoning)
- **Language(s):** English
- **License:** MIT
- **Finetuned from model:** deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

---

## Training Configuration

- 🖥️ **Hardware:** Trained on a RunPod instance with:
  - 🔥 6 × NVIDIA H100 PCIe GPUs
  - 🧠 144 vCPUs
  - 🧮 1132 GB system RAM
  - 💽 20 GB disk per GPU
- 🐳 **Container Image:** `runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04`
- ⏱️ **Total Training Time:** 2 hours
- 💸 **Cost:** ~$14/hour × 2 hours = **$28 USD**
- ⚙️ **Zero Redundancy Optimization:** DeepSpeed ZeRO Stage 1 (a configuration sketch appears at the end of this card)
- 🎯 **Precision:** FP16 mixed-precision training

---

## Performance

- **Mean token-level accuracy:** **97%**
- Evaluation is based on in-training token-match accuracy over the formatted `<think>...</think><answer>...</answer>` structure (a sketch of this metric appears at the end of this card).
- The model demonstrates strong reasoning capability on multi-step arithmetic and logic problems.

---

## Inference Format

To generate accurate completions, prompt the model in the following structure:

```
Question: If Sally has 3 apples and buys 2 more, how many does she have in total?
```

Be aware that the model is trained to begin its response with a `<think>` tag: it will continue reasoning within `<think>...</think>` and then provide a final answer inside `<answer>...</answer>`. A runnable usage sketch is included at the end of this card.

---

## Intended Use

This model is intended for educational and research purposes in:

- Chain-of-thought prompting
- Math reasoning and logical inference
- Question answering with intermediate steps

---

## Limitations

- Trained on structured synthetic data; real-world generalization may vary
- Best performance is achieved when following the exact inference format
- Does not support multilingual inputs

---

## Citation

If you use this model, please cite:

```
@misc{yin2024gsm8k,
  author = {Yiqiao Yin},
  title  = {TBD},
  year   = 2025,
  note   = {TBD}
}
```

---

## Model Card Contact

Author: Yiqiao Yin

Connect with me on [LinkedIn](https://www.linkedin.com/in/yiqiaoyin/)
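---

## Example Usage (Sketch)

For convenience, here is a minimal sketch of running the model with 🤗 Transformers, following the inference format described above. The repository ID `your-username/your-finetuned-model` is a placeholder (this card does not state the published model ID), and the generation settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute the actual fine-tuned model repository ID.
MODEL_ID = "your-username/your-finetuned-model"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # matches the FP16 training precision
    device_map="auto",
)

# Follow the trained prompt structure: a plain "Question: ..." line.
prompt = "Question: If Sally has 3 apples and buys 2 more, how many does she have in total?\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# The completion should contain <think>...</think> reasoning followed by
# the final answer inside <answer>...</answer>.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion)
```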
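---

## DeepSpeed Configuration (Sketch)

The exact training configuration is not published. As an illustration, the snippet below shows how a DeepSpeed ZeRO Stage 1 + FP16 setup (the two settings this card does state) could be passed to the 🤗 `Trainer`; all other fields are assumed `"auto"` placeholders.

```python
from transformers import TrainingArguments

# Only ZeRO Stage 1 and FP16 are stated in this card; the remaining
# fields are illustrative placeholders resolved by the Trainer.
ds_config = {
    "zero_optimization": {"stage": 1},  # ZeRO Stage 1: shard optimizer states
    "fp16": {"enabled": True},          # FP16 mixed-precision training
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

training_args = TrainingArguments(
    output_dir="outputs",
    fp16=True,
    deepspeed=ds_config,  # accepts a dict or a path to a JSON config file
)
```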
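---

## Token-Level Accuracy Metric (Sketch)

The 97% figure above is described as in-training token-match accuracy; the precise implementation is not given. The function below is one standard way to compute mean token-level accuracy for a causal LM, assuming 🤗-style labels in which `-100` marks ignored positions.

```python
import torch

def token_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Mean token-level accuracy for a causal LM batch.

    logits: (batch, seq_len, vocab_size) model outputs
    labels: (batch, seq_len) target ids, with -100 at ignored positions
    """
    # Shift by one so each position predicts the next token,
    # mirroring the causal language-modeling loss.
    preds = logits[:, :-1, :].argmax(dim=-1)
    targets = labels[:, 1:]
    mask = targets != -100  # skip padding / masked positions
    correct = (preds == targets) & mask
    return (correct.sum() / mask.sum()).item()
```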