Update README.md
- **Language(s) (NLP):** [RU/EN]
- **Finetuned from model:** [Qwen2.5]

**Distributed training:**
- DeepSpeed (Stage 3)
- HuggingFace Accelerate

**Fusion:**
- Flash Attention 2
- Fused AdamW
- Liger Kernel (SwiGLU, fused linear cross-entropy)

**GPU hours:** ~384 on NVIDIA A100

### Training configuration / Конфигурация обучения

**The model was trained using the MyLLM framework:**

--== [MyLLM](https://github.com/Raumberg/myllm) ==--
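The actual DeepSpeed configuration is not published in this excerpt. A minimal ZeRO Stage 3 config of the kind the setup above implies (all values illustrative, not taken from the source) might look like:

```json
{
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

With HuggingFace Accelerate, a config file like this is typically passed at launch time, e.g. `accelerate launch --use_deepspeed --deepspeed_config_file ds_stage3.json train.py` (the script name here is a placeholder).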