- Achieves SOTA with only limited training samples. A model with only *14B parameters* outperforms [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) and [Claude-3.5-Sonnet](https://www.anthropic.com/news/claude-3-5-sonnet) on various medical tasks.
- Shows strong generalization across diverse problem types. It excels not only in medical tasks but also in mathematics and other challenging problems (such as [Ruozhiba](https://huggingface.co/datasets/m-a-p/COIG-CQIA)).
- Releases open-source models and various tools, such as training datasets and a Gradio demo.
# Citation

```bibtex
@Misc{med-r1,
  title        = {Med-R1: Encourage Medical LLM to engage in deep thinking similar to DeepSeek-R1},
  author       = {Rongsheng Wang},
  howpublished = {\url{https://github.com/WangRongsheng/Med-R1}},
  year         = {2025}
}
```