attn-signs/Watari-32b-v2

attn-signs committed 29 days ago

Commit 393f594 · verified · 1 Parent(s): 7b29424

Update README.md

Files changed (1)

README.md +3 -0


@@ -59,6 +59,9 @@ SFT LoRA training was performed on **two NVIDIA A10
 - Liger Kernel (swiglu, fused linear xentropy)
 **GPU hours**: ~384 of NVIDIA A100
+**GPU mem**:
+- Stage 1: 50-55GB of VRAM (both GPUs)
+- Stage 2: 79GB of VRAM (both GPUs)
 ### Training configuration / Конфигурация обучения
 **The model was trained using MyLLM framework:**