DeepSeek-R1-Distill-Qwen-7B-GRPO / train_results.json
Model save
a62814b verified
{
    "total_flos": 0.0,
    "train_loss": 0.028963628810559583,
    "train_runtime": 59585.1579,
    "train_samples": 10800,
    "train_samples_per_second": 0.181,
    "train_steps_per_second": 0.006
}
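
The throughput fields can be cross-checked against the runtime and sample count. The sketch below is a sanity check, not part of the training code: it assumes `train_runtime` is in seconds and that the run made a single pass over the 10800 samples (the file does not state the epoch count). The helper name `derived_metrics` is hypothetical.

```python
import json

# Values copied verbatim from train_results.json above.
RESULTS_JSON = """
{
    "total_flos": 0.0,
    "train_loss": 0.028963628810559583,
    "train_runtime": 59585.1579,
    "train_samples": 10800,
    "train_samples_per_second": 0.181,
    "train_steps_per_second": 0.006
}
"""


def derived_metrics(results):
    """Derive rough quantities from the reported fields (hypothetical helper).

    Assumptions: train_runtime is in seconds, and the run covered the
    training set exactly once.
    """
    runtime = results["train_runtime"]
    samples = results["train_samples"]
    return {
        # Wall-clock duration in hours.
        "runtime_hours": runtime / 3600,
        # Throughput implied by one pass over the data.
        "samples_per_second_one_pass": samples / runtime,
        # Coarse estimate only: the reported rate is rounded to 3 decimals.
        "approx_steps": results["train_steps_per_second"] * runtime,
    }


results = json.loads(RESULTS_JSON)
metrics = derived_metrics(results)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# True if the single-pass assumption reproduces the reported throughput.
consistent = (
    round(metrics["samples_per_second_one_pass"], 3)
    == results["train_samples_per_second"]
)
print("one-pass throughput matches reported value:", consistent)
```

Under these assumptions the run lasted about 16.55 hours, and one pass over the data reproduces the reported 0.181 samples per second. The step estimate is only approximate because `train_steps_per_second` is rounded to three decimals.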