Commit 7b29424 (verified) by attn-signs · 1 parent: 3d3338c

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -49,17 +49,17 @@ SFT LoRA обучение было выполнено на **двух NVIDIA A10
  - **Language(s) (NLP):** [RU/EN]
  - **Finetuned from model:** [Qwen2.5]

- **Distributed training:**
- - DeepSpeed (Stage 3)
- - HuggingFace Accelerator
-
- **Fusion:**
+ **Distributed training:**
+ - DeepSpeed (Stage 3)
+ - HuggingFace Accelerator
+
+ **Fusion:**
  - Flash Attention 2
  - Fused AdamW
- - Liger Kernel (swiglu, fused linear xentropy)
- -
- **GPU hours**: ~384 of NVIDIA A100
-
+ - Liger Kernel (swiglu, fused linear xentropy)
+
+ **GPU hours**: ~384 of NVIDIA A100
+
  ### Training configuration / Конфигурация обучения
  **The model was trained using MyLLM framework:**
  --== [MyLLM](https://github.com/Raumberg/myllm) ==--
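
For readers unfamiliar with the "DeepSpeed (Stage 3)" entry in the README hunk above, the following is a minimal sketch of what a ZeRO stage 3 configuration might look like when built in Python and serialized to JSON. Only the stage number comes from the README; the helper name `build_zero3_config`, the bf16 choice, and every other value are assumptions for illustration, not the settings actually used by MyLLM for this model. The `"auto"` placeholders are only resolved when the config is consumed through the Hugging Face Trainer or Accelerate integration.

```python
import json


def build_zero3_config(use_bf16: bool = True) -> dict:
    """Return a hypothetical DeepSpeed config dict using ZeRO stage 3.

    Only "stage": 3 is taken from the README; all other values are
    illustrative assumptions.
    """
    return {
        "zero_optimization": {
            # Stage 3 partitions parameters, gradients and optimizer state.
            "stage": 3,
            # Overlap gradient communication with the backward pass (assumed).
            "overlap_comm": True,
            # Gather full 16-bit weights on save so a checkpoint is loadable
            # without DeepSpeed (assumed).
            "stage3_gather_16bit_weights_on_model_save": True,
        },
        # Mixed-precision setting (assumed, not stated in the README).
        "bf16": {"enabled": use_bf16},
        # "auto" is filled in by the Hugging Face integration at launch time.
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
    }


if __name__ == "__main__":
    # Print the config as JSON, e.g. to redirect into a ds_config.json file.
    print(json.dumps(build_zero3_config(), indent=2))
```

The resulting JSON could be saved to a file and referenced from an Accelerate or Trainer launch configuration; the exact wiring depends on the training framework and is not shown in this commit.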