Qwen3-4B-baseline-iter_0000952
This is a model uploaded from /mnt/nanjingcephfs/project_wx-rec-alg-bdc-exp/bwzheng/yulan/hyw/Ubiquant-Pretrain/build/wjp-share/output_mcore_qwen3_formal_pretrain/checkpoint/2025.10.15-22.06.42-pretrain-mcore-qwen3-moe-megatron-4B-lr-1e-5-minlr-1e-6-bs-1-gbs-512-seqlen-8192-pr-bf16-tp-1-pp-1-cp-1-ac-sel-do-true-sp-false-ti-2384-wi-238/iter_0000952/hf_ckpt.
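The run directory name in the path above appears to pack training settings into hyphen-separated key-value pairs (for example `lr-1e-5`, `gbs-512`, `seqlen-8192`). The following is a minimal sketch for pulling those pairs out into a dictionary. The key list and the helper name `parse_run_name` are assumptions made for illustration; the uploader does not document this naming scheme, so the meaning of each key is left uninterpreted.

```python
import re

# Run directory name copied verbatim from the checkpoint path above.
RUN_NAME = (
    "2025.10.15-22.06.42-pretrain-mcore-qwen3-moe-megatron-4B-"
    "lr-1e-5-minlr-1e-6-bs-1-gbs-512-seqlen-8192-pr-bf16-"
    "tp-1-pp-1-cp-1-ac-sel-do-true-sp-false-ti-2384-wi-238"
)

# Keys as they appear in the name. Treating them as setting names is an
# assumption; their meanings are not documented in the model card.
KEYS = ["lr", "minlr", "bs", "gbs", "seqlen", "pr",
        "tp", "pp", "cp", "ac", "do", "sp", "ti", "wi"]


def parse_run_name(name, keys=KEYS):
    """Return {key: raw string value} for every known key found in `name`.

    A value runs until the next "-<key>-" boundary or the end of the name,
    so values that contain hyphens themselves (such as "1e-5") stay intact.
    """
    alternation = "|".join(re.escape(k) for k in keys)
    pattern = re.compile(
        rf"(?:^|-)({alternation})-(.+?)(?=-(?:{alternation})-|$)"
    )
    return dict(pattern.findall(name))


if __name__ == "__main__":
    for key, value in parse_run_name(RUN_NAME).items():
        print(f"{key} = {value}")
```

Values are kept as raw strings so that nothing is inferred about their types or units.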