BasedBase committed on
Commit 97a531b · verified · 1 Parent(s): c0da93f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
 
  ## Model Description
 
- This model was created by distilling the knowledge Qwen3-Coder-480B Mixture-of-Experts (MoE) teacher model into the compact and efficient **Tesslate/WEBGEN-4B-Preview** base.
+ This model was created by distilling the Qwen3-Coder-480B Mixture-of-Experts (MoE) teacher model into the compact and efficient **Tesslate/WEBGEN-4B-Preview** base.
 
  The purpose of this distillation is to give the Webgen-4B-Preview model some of the knowledge of a large MoE model and so improve its overall performance. This model should perform better for web design, but it is still a 4B model.
  **It is recommended to use bf16, as it is still only 8 GB and small models are very sensitive to quantization. For optimal results, be specific in your prompting and avoid vague, ambiguous prompts like "Create a website for a taco restaurant". Instead, use prompts like "Make a single-file landing page for "RasterFlow" (GPU video pipeline).
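
The "only 8 GB" figure in the README text above can be checked with simple arithmetic: a bf16 weight takes 2 bytes, so roughly 4 billion parameters come to about 8 GB. The sketch below is illustrative only. It assumes "4B" means exactly 4 billion parameters and uses decimal gigabytes, and the helper name `weight_memory_gb` is hypothetical, not part of any library. It counts weights only and ignores activations, KV cache, and runtime overhead.

```python
# Rough weight-only memory estimate for a model at different precisions.
# Assumption: bytes per parameter are the nominal storage widths; real
# quantized formats add some overhead (scales, zero points) not modeled here.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 4e9  # assumption: "4B" taken as exactly 4 billion parameters
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(params, dtype):.0f} GB")
```

Under these assumptions, quantizing below bf16 saves only a few gigabytes, which may be why the README favors keeping full bf16 precision for a model this small.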