Update README.md
- **Language(s) (NLP):** [RU/EN]
- **Finetuned from model:** [Qwen2.5]

**Distributed training:**
- DeepSpeed (Stage 3)
- HuggingFace Accelerate

**Fusion:**
- Flash Attention 2
- Fused AdamW
- Liger Kernel (SwiGLU, fused linear cross-entropy)

**GPU hours:** ~384 on NVIDIA A100

### Training configuration / Конфигурация обучения

**The model was trained using the MyLLM framework:**

--== [MyLLM](https://github.com/Raumberg/myllm) ==--
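The actual DeepSpeed configuration is not published in this excerpt. A minimal ZeRO Stage 3 config of the kind the setup above implies (all values illustrative, not taken from the source) might look like:

```json
{
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

With HuggingFace Accelerate, a config file like this is typically passed at launch time, e.g. `accelerate launch --use_deepspeed --deepspeed_config_file ds_stage3.json train.py` (the script name here is a placeholder).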