Commit 44293bb by quyanh · verified · 1 Parent(s): c23d13f

Update README.md

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -45,12 +45,14 @@ Our approach uses 7,000 samples (42,000 total outputs) and costs ~$42 on 4x A40
  ## Citation
  If this project aids your work, please cite it as:
  ```
- @misc{open-rs,
- title = {Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
- url = {https://github.com/knoveleng/open-rs},
- author = {Quy-Anh Dang, Chris Ngo},
- month = {March},
- year = {2025}
+ @misc{dang2025reinforcementlearningreasoningsmall,
+ title={Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
+ author={Quy-Anh Dang and Chris Ngo},
+ year={2025},
+ eprint={2503.16219},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2503.16219},
  }
  ```