Math Coma 7B

A version of theprint/MathTutor-7B further finetuned on natural reasoning using GRPO. This is an experimental model and is likely to hallucinate.

  • Developed by: theprint
  • License: apache-2.0
  • Finetuned from model: theprint/MathTutor-7B
  • Model size: 8B params
  • Tensor type: BF16 (Safetensors)
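
Below is a minimal inference sketch using the Transformers library. The model ID comes from this card; the prompt, generation settings, and the bfloat16/device-map choices are illustrative assumptions, not documented usage.

```python
# Minimal inference sketch (assumes the transformers and accelerate
# packages are installed; prompt and settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "theprint/Math-Coma-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # matches the BF16 weights listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```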

This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
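
Since the card says training used GRPO via TRL, here is a hedged sketch of what that setup might look like with TRL's GRPOTrainer. The reward function and dataset are placeholders (the card does not specify either); only the base model name comes from this card.

```python
# Hedged sketch of GRPO fine-tuning with TRL's GRPOTrainer. The reward
# function and dataset below are placeholders: the card does not say
# which reward or data were actually used.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Placeholder reward favoring completions near 200 characters;
    # NOT the reward used to train Math-Coma-7B.
    return [-abs(len(completion) - 200) / 200.0 for completion in completions]

# Placeholder prompt dataset; substitute the actual reasoning data.
dataset = load_dataset("trl-lib/tldr", split="train")

training_args = GRPOConfig(output_dir="Math-Coma-7B-GRPO", per_device_train_batch_size=2)
trainer = GRPOTrainer(
    model="theprint/MathTutor-7B",  # the base model named in this card
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```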
