Commit a4787ce (verified) by Kearm · 1 parent: e0eed5f

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -10,4 +10,6 @@ datasets:
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- Acc Qwen 4B is a state of the art accessibility GRPO RL trained model with RM_R1 style Chain of Rubric distsillation of Claude 4 Opus using Gemini 2.5 Flash to Qwen 3 4B over 18 million tokens.
+ Acc Qwen 4B is a state of the art accessibility GRPO RL trained model with RM_R1 style Chain of Rubric distsillation of Claude 4 Opus using Gemini 2.5 Flash to Qwen 3 4B over 18 million tokens.
+
+ The code for training the model is at https://github.com/Nottlespike/Accessible_Qwen