Update README.md
README.md
CHANGED
@@ -35,7 +35,7 @@ AstroLLaMA-2-70B-Chat_AIC is a specialized chat model for astronomy, developed b
 - Warmup ratio: 0.03
 - Cosine decay schedule for learning rate reduction
 - **Primary Use**: Instruction-following and chat-based interactions for astronomy-related queries
-- **Reference**: Pan et al. 2024
+- **Reference**: [Pan et al. 2024](https://arxiv.org/abs/2409.19750)
 
 ## Using the model for chat
 
@@ -84,6 +84,7 @@ While the AstroLLaMA-2-70B-Base_AIC model demonstrated significant improvements
 
 | Model | Score (%) |
 |-------|-----------|
+| **AstroSage-LLaMA-3.1-8B (AstroMLab)** | **80.9** |
 | **<span style="color:green">AstroLLaMA-2-70B-Base (AstroMLab)</span>** | **<span style="color:green">76.0</span>** |
 | LLaMA-3.1-8B | 73.7 |
 | LLaMA-2-70B | 70.7 |
@@ -105,7 +106,7 @@ These limitations underscore the challenges in developing specialized chat model
 
 This model is released primarily for reproducibility purposes, allowing researchers to track the development process and compare different iterations of AstroLLaMA models.
 
-For optimal performance and the most up-to-date capabilities in astronomy-related tasks, we recommend using AstroSage-8B, where these limitations have been addressed through expanded training data and refined fine-tuning processes.
+For optimal performance and the most up-to-date capabilities in astronomy-related tasks, we recommend using AstroSage-LLaMA-3.1-8B, where these limitations have been addressed through expanded training data and refined fine-tuning processes.
 
 ## Ethical Considerations
 