Model Card for Model ID
Model Details
Model Description
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: [Abaryan]
- Funded by [optional]: [More Information Needed]
- **Shared by [Abaryan]
- Model type: [GRPO + CoT]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [Qwen_2.5_1.5b]: [More Information Needed]
Training Details
Training Data
[GSM8K]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[bf16, no quantisation, no LoRA,Batch_size=5, num of generation = 5, 3000_steps]
Evaluation
Metrics
[More Information Needed]
Results
[More Information Needed]
Model Architecture and Objective
[Transformers]
Compute Infrastructure
[More Information Needed]
Hardware
[2x 4080s]
Software
[cuda_12.6 & pytorch_2.6]
BibTeX:
[More Information Needed]
APA:
- Downloads last month
- 11