Update README.md
README.md CHANGED
@@ -11,7 +11,7 @@ This is a SIGNIFICANTLY cool outcome. I widened Qwen3-32B. And it's still perf
 
 This is an intermediate checkpoint in the process of expanding Qwen3-32B to match Qwen3-72B architecture dimensions. This model represents Stage 1 of a two-stage upscaling process, where the hidden dimensions and attention heads have been expanded, but the model still maintains 64 layers.
 
-the code is here
+the code is [here](stage1_v2.py)
 
 **⚠️ Note: This is an intermediate checkpoint not intended for direct use. For the complete model, use Qwen3-72B-Embiggened.**
 
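The Stage 1 step the README describes (widening hidden dimensions and attention-head projections while keeping all 64 layers) could be sketched roughly as below. This is a minimal illustration only, assuming a simple zero-padded expansion with toy dimensions; the actual method lives in stage1_v2.py, and `widen_linear` is a hypothetical helper, not part of that script.

```python
import numpy as np

def widen_linear(weight, new_out, new_in):
    """Embed an existing weight matrix into a larger one (hypothetical sketch).

    The original weights are copied into the top-left block; the newly added
    rows and columns are zero-initialized. Real upscaling pipelines often use
    more careful initialization, but this shows the shape transformation.
    """
    out_dim, in_dim = weight.shape
    widened = np.zeros((new_out, new_in), dtype=weight.dtype)
    widened[:out_dim, :in_dim] = weight  # preserve the original parameters
    return widened

# Toy example: a 4x4 projection widened to 6x6, standing in for the
# 32B -> 72B hidden-dimension expansion.
w = np.arange(16.0).reshape(4, 4)
w_wide = widen_linear(w, 6, 6)
```

Applied per projection matrix across all 64 layers, this changes the model's width without touching its depth, which is why Stage 2 (depth expansion) remains a separate step.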