LichengLiu03/Qwen2.5-3B-UFO · Improve model card: Add detailed framework and results sections

24 days ago

This PR enhances the model card by integrating more comprehensive details directly from the project's GitHub repository.

Key additions include:

- A detailed "Overview" section explaining the problem, the UFO framework solution, and its impact.
- A dedicated "UFO Framework Details" section outlining the problem formulation, how Unary Feedback as Observation (UFO) works, the training approach with PPO, and the reward design strategies.
- A "Key Results" section presenting the multi-turn reasoning performance, effectiveness of unary feedback, and the impact of reward design, complete with supporting figures from the paper's project page.
- An "Acknowledgements" section to properly credit contributing teams.

Together, these additions give readers a more thorough understanding of the model's capabilities, training methodology, and empirical performance.

Improve model card: Add detailed framework and results sections (18440142)

LichengLiu03 changed pull request status to merged 24 days ago