Qwen3-4B-baseline-iter_0000952

This is a model uploaded from /mnt/nanjingcephfs/project_wx-rec-alg-bdc-exp/bwzheng/yulan/hyw/Ubiquant-Pretrain/build/wjp-share/output_mcore_qwen3_formal_pretrain/checkpoint/2025.10.15-22.06.42-pretrain-mcore-qwen3-moe-megatron-4B-lr-1e-5-minlr-1e-6-bs-1-gbs-512-seqlen-8192-pr-bf16-tp-1-pp-1-cp-1-ac-sel-do-true-sp-false-ti-2384-wi-238/iter_0000952/hf_ckpt.
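The run directory in the path above packs the training settings into its name as hyphen-separated key/value pairs. As a minimal sketch, the pairs can be pulled into a dictionary for inspection. The list of keys is an assumption inferred from the name itself, and `parse_run_name` is a hypothetical helper, not part of any released tooling. What each key means is not documented in this card, so values are kept as raw strings.

```python
# Suffix of the run directory name, copied from the checkpoint path above.
RUN_NAME = (
    "lr-1e-5-minlr-1e-6-bs-1-gbs-512-seqlen-8192-pr-bf16-"
    "tp-1-pp-1-cp-1-ac-sel-do-true-sp-false-ti-2384-wi-238"
)

# Assumed key tokens, read off the name; their meanings are not stated in the card.
KEYS = ("lr", "minlr", "bs", "gbs", "seqlen", "pr",
        "tp", "pp", "cp", "ac", "do", "sp", "ti", "wi")


def parse_run_name(name: str) -> dict[str, str]:
    """Split a hyphen-separated run name into {key: raw string value}."""
    fields: dict[str, list[str]] = {}
    current = None
    for token in name.split("-"):
        if token in KEYS:
            # A known key starts a new field.
            current = token
            fields[current] = []
        elif current is not None:
            # Anything else belongs to the current value (e.g. "1e" + "5").
            fields[current].append(token)
    # Re-join value pieces so "1e", "5" becomes "1e-5" again.
    return {key: "-".join(parts) for key, parts in fields.items()}


settings = parse_run_name(RUN_NAME)
for key, value in settings.items():
    print(f"{key}: {value}")
```

Keeping the values as strings avoids guessing at types or units that the card does not specify.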
