Update README.md
@@ -11,6 +11,8 @@ This is a SIGNIFICANTLY cool outcome. I widened Qwen3-32B. And it's still perf
This is an intermediate checkpoint in the process of expanding Qwen3-32B to match Qwen3-72B architecture dimensions. This model represents Stage 1 of a two-stage upscaling process, where the hidden dimensions and attention heads have been expanded, but the model still maintains 64 layers.

+ The code is here: [stage1_v2.py]

**⚠️ Note: This is an intermediate checkpoint not intended for direct use. For the complete model, use Qwen3-72B-Embiggened.**

## Architecture Changes
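The referenced `stage1_v2.py` is not shown here, but the Stage 1 idea (widen hidden dimensions while keeping all 64 layers) can be sketched generically. The snippet below is a minimal NumPy illustration of one common widening scheme, zero-padding a weight matrix so the enlarged layer reproduces the original outputs on the old coordinates; the actual script may instead duplicate or interpolate weights, and the dimensions here are toy values, not Qwen3's real ones.

```python
import numpy as np

def widen_linear(W: np.ndarray, new_out: int, new_in: int) -> np.ndarray:
    """Embed an (out, in) weight matrix into a larger (new_out, new_in)
    matrix, zero-initializing the new rows and columns. With inputs
    zero-padded the same way, the widened layer computes the same values
    as the original on the first `out` output coordinates."""
    out_dim, in_dim = W.shape
    W_wide = np.zeros((new_out, new_in), dtype=W.dtype)
    W_wide[:out_dim, :in_dim] = W  # original weights in the top-left block
    return W_wide

# Toy check: widening a 4x4 layer to 6x6 preserves the original outputs.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
x = rng.standard_normal(4)
x_wide = np.zeros(6)
x_wide[:4] = x
y_wide = widen_linear(W, 6, 6) @ x_wide
assert np.allclose(y_wide[:4], W @ x)   # old coordinates unchanged
assert np.allclose(y_wide[4:], 0.0)     # new coordinates start inert
```

The same embedding would be applied per projection (attention Q/K/V/O and MLP matrices), with the head count expanded alongside the hidden size; a function-preserving start like this is why the checkpoint can remain usable before Stage 2 adds the remaining capacity.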