Context size: 32K + 8K for output (40K total)
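As a quick illustration of that budget, the split can be expressed as a small helper. This is a sketch only; the exact counts of 32×1024 and 8×1024 tokens are assumptions, since the card states just "32K + 8K (40K total)":

```python
# Hypothetical token-budget helper. Exact counts are assumptions:
# the card states only "32K + 8K for output (40K total)".
INPUT_BUDGET = 32 * 1024   # context available for the prompt
OUTPUT_BUDGET = 8 * 1024   # context reserved for generation

def max_new_tokens(prompt_tokens: int) -> int:
    """Tokens still available for output, given the prompt length."""
    total = INPUT_BUDGET + OUTPUT_BUDGET   # 40K-token window overall
    return max(0, min(OUTPUT_BUDGET, total - prompt_tokens))

print(max_new_tokens(1000))   # prompt well under budget: full 8192 remain
```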
Use the Jinja template or the ChatML template.
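For the ChatML option, the prompt layout can be sketched as follows. This is a minimal illustration of standard ChatML wrapping; in practice the bundled Jinja chat template applies this formatting for you:

```python
# Minimal sketch of ChatML prompt formatting (illustrative only;
# a runtime's Jinja chat template normally does this wrapping).
def chatml_prompt(system: str, user: str) -> str:
    """Wrap a system + user turn in ChatML tags, leaving the
    assistant turn open for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(chatml_prompt("You are a helpful assistant.", "Hello!"))
```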
IMPORTANT NOTES:

- Due to the unique nature of this model (MoE architecture, size, and number of activated experts), GGUF quants can be run on the CPU, on the GPU, or with partial GPU offload, right up to full precision.
- This model is difficult to imatrix: you need a much larger, multi-language / multi-content imatrix dataset to generate an imatrix for it.
Please refer to the original model card for details, benchmarks, usage, settings, system roles, etc.:

[ https://huggingface.co/Qwen/Qwen3-30B-A3B ]