attn-signs committed · Commit 393f594 · verified · 1 Parent(s): 7b29424

Update README.md

Files changed (1): README.md +3 -0
@@ -59,6 +59,9 @@ SFT LoRA обучение было выполнено на **двух NVIDIA A10
  - Liger Kernel (swiglu, fused linear xentropy)
 
  **GPU hours**: ~384 of NVIDIA A100
+ **GPU mem**:
+ - Stage 1: 50-55GB of VRAM (both GPUs)
+ - Stage 2: 79GB of VRAM (both GPUs)
 
  ### Training configuration / Конфигурация обучения
  **The model was trained using MyLLM framework:**