BasedBase committed on
Commit 8584bb8 · verified · 1 Parent(s): e6518a6

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ tags:
 
 ## Model Description
 
-This model is a distilled version of **`Qwen/Qwen3-30B-A3B-Instruct`** designed to inherit the reasoning and behavioral characteristics of its much larger teacher model, **`deepseek-ai/DeepSeek-V3.1`**.
+This model is a distilled version of **`Qwen/Qwen3-30B-A3B-Thinking`** designed to inherit the reasoning and behavioral characteristics of its much larger teacher model, **`deepseek-ai/DeepSeek-V3.1`**.
 
 It is the result of applying a LoRA created via an SVD-based distillation pipeline, and then merging those weights into the base model. The core of this process was to transfer the nuanced knowledge from a **62-layer, 256-expert teacher model** into the more efficient **48-layer, 128-expert architecture** of the student model.
 
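The README paragraph in this hunk mentions a LoRA "created via an SVD-based distillation pipeline" that is then merged into the base model. As a rough illustration of that general idea only, the NumPy sketch below factors a weight delta into low-rank LoRA matrices with a truncated SVD and merges them back into a base weight. It is not the author's pipeline: the names `lora_from_delta` and `merge_lora`, the toy shapes, and the assumption that a target delta with the student's shape already exists are all hypothetical. The README does not describe how teacher weights are mapped onto the student's layers and experts.

```python
import numpy as np

def lora_from_delta(delta, rank):
    """Factor a weight delta into rank-`rank` LoRA matrices via truncated SVD.

    Hypothetical helper: returns (b, a) with b @ a being the best rank-`rank`
    approximation of `delta` in the Frobenius norm.
    """
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s            # shape (out_features, rank)
    a = sqrt_s[:, None] * vt[:rank]     # shape (rank, in_features)
    return b, a

def merge_lora(base, b, a, scale=1.0):
    """Merge LoRA factors into a base weight: W' = W + scale * (B @ A)."""
    return base + scale * (b @ a)

# Toy data (assumption): an 8x6 base weight and a delta of exact rank 2.
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 6))
delta = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))

b, a = lora_from_delta(delta, rank=2)
merged = merge_lora(base, b, a)
```

Because the toy delta has exact rank 2, the rank-2 factors reproduce it up to floating-point error. With a real, full-rank delta the truncation discards the smaller singular values, so the merged weight only approximates the target.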