- Achieves SOTA with only limited training samples. A model with only *14B parameters* outperforms [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) and [Claude-3.5-Sonnet](https://www.anthropic.com/news/claude-3-5-sonnet) on various medical tasks.
- Shows strong generalization across diverse problem types. It excels not only in medical tasks but also in mathematics and other challenging problems (such as [Ruozhiba](https://huggingface.co/datasets/m-a-p/COIG-CQIA)).
- Releases open-source models and various tools, such as training datasets and a Gradio demo.
# Citation

```bibtex
@Misc{med-r1,
  title        = {Med-R1: Encourage Medical LLM to engage in deep thinking similar to DeepSeek-R1},
  author       = {Rongsheng Wang},
  howpublished = {\url{https://github.com/WangRongsheng/Med-R1}},
  year         = {2025}
}
```