Update README.md
README.md CHANGED
@@ -15,8 +15,7 @@ the code to generate this model is here: [stage1_v2.py](https://huggingface.co/c
 
 This model was made possible by excellent AMD mi300x compute generously provided by [Hot Aisle](https://hotaisle.xyz/).
 
-
-**⚠️ Note: This is an intermediate checkpoint not intended for direct use. For the complete model, use Qwen3-72B-Embiggened.**
+As is, this model underperforms Qwen3-32B. The intent is to create a target suitable for distillation from Qwen3-235B.
 
 ## Architecture Changes
 