DeepSeek-R1-Distill-Qwen-0.5B-CoMa

This model is a distillation of DeepSeek-R1 into Qwen-CoMa-0.5b, fine-tuned on the R1-Distill-SFT dataset.

Built with Axolotl

Axolotl config:
base_model: someon98/qwen-CoMa-0.5b
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

chat_template: qwen_25
datasets:
  - path: /kaggle/working/custom_dataset.json
    type: chat_template
    conversation: chatml
    ds_type: json

add_bos_token: true
add_eos_token: true
use_default_system_prompt: false

adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: paged_adamw_8bit
lr_scheduler: cosine

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
flash_attention: false

logging_steps: 50
warmup_steps: 100
saves_per_epoch: 1

output_dir: ./qwen-sft-results
save_safetensors: true
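
To make the schedule settings above concrete, the sketch below computes the learning rate implied by `learning_rate: 2e-5`, `warmup_steps: 100`, and `lr_scheduler: cosine`, assuming the common linear-warmup-then-cosine-decay shape. The total number of optimizer steps is not stated in this card, so `total_steps=1000` is a hypothetical value chosen only for illustration, and `lr_at_step` is an illustrative helper, not part of Axolotl.

```python
import math

def lr_at_step(step, base_lr=2e-5, warmup_steps=100, total_steps=1000):
    """Linear warmup followed by cosine decay to zero.

    total_steps is a hypothetical value; the real number depends on the
    dataset size, which this card does not state.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))

for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step))
```

Under these assumptions the rate peaks at 2e-5 once warmup ends and falls to half of that midway through the remaining steps.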

Example usage:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Qwen-0.5B-CoMa",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Qwen-0.5B-CoMa")

SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>

Answer goes here...

Always use <think> </think> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]
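inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)  # follow the device chosen by device_map="auto"
# Without do_sample=True, generate may decode greedily and ignore temperature.
output = model.generate(input_ids=inputs, max_new_tokens=256, use_cache=True, do_sample=True, temperature=0.7)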
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)

Output:

<think>
First, I need to compare the two numbers 9.11 and 9.9. 

Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9. 

Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>

To determine which number is larger, let's compare the two numbers:

**9.11** and **9.9**

1. **Identify the Decimal Places:**
   - Both numbers have two decimal places.
   
2. **Compare the Tens Place (Right of the Decimal Point):**
   - **9.11:** The tens place is 1.
   - **9.9:** The tens place is 9.
   
3. **Conclusion:**
   - Since 9 is greater than 1, the number with the larger tens place is 9.9.
   
**Answer:** **9.9** is larger than **9.11**.
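
Because the response follows the `<think>...</think>` format shown above, the reasoning can be separated from the final answer with a little string handling. The following is a minimal sketch; `split_reasoning` is a hypothetical helper name, and it assumes the text passed in contains at most one `<think>` block.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Return (reasoning, answer); reasoning is None if no think block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return None, text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>\n9 is greater than 1.\n</think>\n\n**Answer:** 9.9 is larger."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

When applying this to the usage example, it is probably safest to decode only the newly generated tokens (for instance `output[0][inputs.shape[-1]:]`), since the system prompt in the decoded text also contains `<think>` tags.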

Suggested system prompt:

Respond in the following format:
<think>
You should reason between these tags.
</think>

Answer goes here...

Always use <think> </think> tags even if they are not necessary.

Parameters

  • lr: 2e-5
  • epochs: 1
  • batch_size: 16
  • optimizer: paged_adamw_8bit
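
The `batch_size: 16` figure is consistent with the Axolotl config, where `micro_batch_size: 2` and `gradient_accumulation_steps: 8`. The sketch below spells out that arithmetic, along with the LoRA scaling factor implied by `lora_alpha: 32` and `lora_r: 16`. It assumes training ran on a single device; the card does not state the device count, so `num_devices=1` is an assumption, and both function names are illustrative.

```python
def effective_batch_size(micro_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of sequences contributing to each optimizer step."""
    return micro_batch_size * gradient_accumulation_steps * num_devices

def lora_scaling(lora_alpha, lora_r):
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return lora_alpha / lora_r

# Values taken from the Axolotl config; num_devices=1 is assumed.
batch = effective_batch_size(2, 8)
scale = lora_scaling(32, 16)
print(batch, scale)
```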

Model size: 494M params (Safetensors, tensor type FP16)

Base model: Qwen/Qwen2.5-0.5B
