RESMP-DEV/Accessible_Qwen_4B

Kearm committed on Jun 1

Commit a4787ce · verified · 1 Parent(s): e0eed5f

Update README.md

Files changed (1)

README.md +3 -1

@@ -10,4 +10,6 @@ datasets:
 <!-- Provide a quick summary of what the model is/does. -->
-Acc Qwen 4B is a state-of-the-art accessibility GRPO RL trained model with RM_R1 style Chain of Rubric distillation of Claude 4 Opus using Gemini 2.5 Flash to Qwen 3 4B over 18 million tokens.

 <!-- Provide a quick summary of what the model is/does. -->
+Acc Qwen 4B is a state-of-the-art accessibility GRPO RL trained model with RM_R1 style Chain of Rubric distillation of Claude 4 Opus using Gemini 2.5 Flash to Qwen 3 4B over 18 million tokens.
+The code for training the model is at https://github.com/Nottlespike/Accessible_Qwen