mtasic85 committed
Commit: f747e58
1 Parent(s): fd25030

cpt core 4

Files changed (1):
  1. README.md +29 -0
README.md CHANGED
@@ -411,5 +411,34 @@ CUDA_VISIBLE_DEVICES=0 python -B cpt_core_model_4.py
  ```
 
  ```
+ 🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
+ 🦥 Unsloth Zoo will now patch everything to make training faster!
+ ==((====))==  Unsloth 2025.3.14: Fast Llama patching. Transformers: 4.49.0.
+    \\   /|    NVIDIA GeForce RTX 3090. Num GPUs = 1. Max memory: 23.58 GB. Platform: Linux.
+ O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.2.0
+ \        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
+  "-____-"     Free license: http://github.com/unslothai/unsloth
+ Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
+ Unsloth 2025.3.14 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
+ Unsloth: Training embed_tokens in mixed precision to save VRAM
+ Unsloth: Training lm_head in mixed precision to save VRAM
+ ==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
+    \\   /|    Num examples = 37,031 | Num Epochs = 2 | Total steps = 37,406
+ O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 1
+ \        /    Data Parallel GPUs = 1 | Total batch size (1 x 1 x 1) = 1
+  "-____-"     Trainable parameters = 247,463,936/482,378,240 (51.30% trained)
+ wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
+ wandb: Currently logged in as: mtasic85 to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
+ wandb: Tracking run with wandb version 0.19.8
+ wandb: Run data is saved locally in /home/tangled/tangled-alpha-0.9-core/scripts/wandb/run-20250315_163140-55plsawx
+ wandb: Run `wandb offline` to turn off syncing.
+ wandb: Syncing run cpt-core-4
+ wandb: ⭐️ View project at https://wandb.ai/mtasic85/tangled-alpha-0.9-core
+ wandb: 🚀 View run at https://wandb.ai/mtasic85/tangled-alpha-0.9-core/runs/55plsawx
+ {'loss': 1.9674, 'grad_norm': 3.058457851409912, 'learning_rate': 4.999999991182871e-05, 'epoch': 0.0}
+   0%|          | 1/37406 [00:03<32:07:27, 3.09s/it]
+ Unsloth: Will smartly offload gradients to save VRAM!
+ {'loss': 3.4846, 'grad_norm': 1.5452028512954712, 'learning_rate': 4.999999964731482e-05, 'epoch': 0.0}
+   0%|▎         | 67/37406 [02:27<23:06:36, 2.23s/it]
  # ...
  ```
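
For context, below is a minimal sketch of the kind of Unsloth continued-pretraining script that could produce a run banner like the one logged above, launched as in the hunk header (`CUDA_VISIBLE_DEVICES=0 python -B cpt_core_model_4.py`). The actual `cpt_core_model_4.py` is not part of this diff: the model checkpoint, dataset file, sequence length, LoRA rank, and embedding learning rate below are placeholders, and only the batch size, gradient accumulation, epoch count, ~5e-5 learning rate, bf16 setting, trainable `embed_tokens`/`lm_head`, and the `cpt-core-4` wandb run name are taken from the log. Exact trainer argument names can differ across unsloth/trl versions.

```python
# Hypothetical reconstruction, NOT the repository's actual cpt_core_model_4.py.
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments
from datasets import load_dataset

max_seq_length = 4096  # placeholder

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./out/cpt-core-3",   # placeholder: checkpoint from the previous CPT stage
    max_seq_length=max_seq_length,
    dtype=None,                      # auto-selects bfloat16 on the RTX 3090
    load_in_4bit=False,
)

# Continued pretraining: include embed_tokens and lm_head among the trained modules,
# matching the "Training embed_tokens/lm_head in mixed precision" lines in the log.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,                            # placeholder rank
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_alpha=64,
    use_gradient_checkpointing="unsloth",
)

# Placeholder dataset; assumed to expose a "text" column.
dataset = load_dataset("json", data_files="cpt_core_4.jsonl", split="train")

trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=UnslothTrainingArguments(
        per_device_train_batch_size=1,   # from the log
        gradient_accumulation_steps=1,   # from the log
        num_train_epochs=2,              # from the log
        learning_rate=5e-5,              # from the log
        embedding_learning_rate=5e-6,    # assumption: lower LR for embed_tokens/lm_head
        bf16=True,                       # from the log
        logging_steps=1,
        report_to="wandb",
        run_name="cpt-core-4",           # from the log
        output_dir="out/cpt-core-4",     # placeholder
    ),
)

trainer.train()
```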