knoveleng/OpenRS-GRPO

Text Generation

quyanh committed 8 days ago

Commit 44293bb · verified · 1 parent: c23d13f

Update README.md

Files changed (1)

README.md +8 -6

@@ -45,12 +45,14 @@ Our approach uses 7,000 samples (42,000 total outputs) and costs ~$42 on 4x A40
 ## Citation
 If this project aids your work, please cite it as:
 ```
-@misc{open-rs,
-  title = {Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
-  url = {https://github.com/knoveleng/open-rs},
-  author = {Quy-Anh Dang, Chris Ngo},
-  month = {March},
-  year = {2025}
+@misc{dang2025reinforcementlearningreasoningsmall,
+      title={Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
+      author={Quy-Anh Dang and Chris Ngo},
+      year={2025},
+      eprint={2503.16219},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2503.16219},
 }
 ```