LichengLiu03/Qwen2.5-3B-UFO-1turn

Text Generation

LichengLiu03 committed on Jul 3

Commit 91662d5 · verified · 1 Parent(s): 5fd7862

Update README.md

Files changed (1)

README.md +2 -0

@@ -16,6 +16,8 @@ pipeline_tag: text-generation
 This model is based on **Qwen2.5-3B-Instruct** and trained with **PPO (Proximal Policy Optimization)** on the **MetaMathQA** dataset for mathematical reasoning.
+Github: https://github.com/lichengliu03/unary-feedback
 ## Model Info
 - **Base model**: Qwen/Qwen2.5-3B-Instruct