hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_warmed_math 2B • Updated 6 minutes ago
hdong0/deepseek-Qwen2.5-7B-baseline-thin-Open-R1-GRPO_deepscaler_acc_mu_8_constant_lr_warmed_math 8B • Updated 43 minutes ago • 2
hdong0/deepseek-Qwen2.5-7B-baseline-thin-Open-R1-GRPO_deepscaler_acc_mu_8_constant_lr_warmed_rerun 8B • Updated about 1 hour ago • 2
hdong0/deepseek-Qwen2.5-7B-baseline-thin-Open-R1-GRPO_deepscaler_acc_mu_8_constant_lr_warmed Text Generation • 8B • Updated about 14 hours ago • 49
hdong0/deepseek-Qwen-7B-batch-mix-GRPO_deepscaler_seq_end_mask_thin_mu_8_warmed_math Text Generation • 8B • Updated 2 days ago • 35
hdong0/Qwen-Math-7B-batch-mix-GRPO_deepscaler_seq_end_mask_thin_mu_8_warmed 8B • Updated 4 days ago • 5