cpt core 4
README.md
CHANGED
@@ -411,5 +411,34 @@ CUDA_VISIBLE_DEVICES=0 python -B cpt_core_model_4.py
 ```
 
 ```
+🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
+🦥 Unsloth Zoo will now patch everything to make training faster!
+==((====))==  Unsloth 2025.3.14: Fast Llama patching. Transformers: 4.49.0.
+   \\   /|    NVIDIA GeForce RTX 3090. Num GPUs = 1. Max memory: 23.58 GB. Platform: Linux.
+O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.2.0
+\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
+ "-____-"     Free license: http://github.com/unslothai/unsloth
+Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
+Unsloth 2025.3.14 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
+Unsloth: Training embed_tokens in mixed precision to save VRAM
+Unsloth: Training lm_head in mixed precision to save VRAM
+==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
+   \\   /|    Num examples = 37,031 | Num Epochs = 2 | Total steps = 37,406
+O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 1
+\        /    Data Parallel GPUs = 1 | Total batch size (1 x 1 x 1) = 1
+ "-____-"     Trainable parameters = 247,463,936/482,378,240 (51.30% trained)
+wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
+wandb: Currently logged in as: mtasic85 to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
+wandb: Tracking run with wandb version 0.19.8
+wandb: Run data is saved locally in /home/tangled/tangled-alpha-0.9-core/scripts/wandb/run-20250315_163140-55plsawx
+wandb: Run `wandb offline` to turn off syncing.
+wandb: Syncing run cpt-core-4
+wandb: ⭐️ View project at https://wandb.ai/mtasic85/tangled-alpha-0.9-core
+wandb: 🚀 View run at https://wandb.ai/mtasic85/tangled-alpha-0.9-core/runs/55plsawx
+{'loss': 1.9674, 'grad_norm': 3.058457851409912, 'learning_rate': 4.999999991182871e-05, 'epoch': 0.0}
+  0%|          | 1/37406 [00:03<32:07:27,  3.09s/it]
+Unsloth: Will smartly offload gradients to save VRAM!
+{'loss': 3.4846, 'grad_norm': 1.5452028512954712, 'learning_rate': 4.999999964731482e-05, 'epoch': 0.0}
+  0%|▎         | 67/37406 [02:27<23:06:36,  2.23s/it]
 # ...
 ```
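For reference, a minimal sketch (not the actual `cpt_core_model_4.py`) of an Unsloth continued-pretraining setup that would produce a log like the one above. Epochs, batch size, learning rate, bf16, and the wandb run name are taken from the log; the base checkpoint, dataset file, sequence length, LoRA rank, and embedding learning rate are placeholder assumptions.

```python
import torch
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments
from datasets import load_dataset

# Load a Llama-family checkpoint; Unsloth patches it for faster training
# ("Fast Llama patching" in the log). Model name is an assumption.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="tangledgroup/tangled-alpha-0.9-core",  # assumed checkpoint
    max_seq_length=4096,                               # assumed
    dtype=torch.bfloat16,                              # "Bfloat16 = TRUE"
    load_in_4bit=False,
)

# LoRA over all 32 QKV/O/MLP layers, plus embed_tokens and lm_head --
# matching "Training embed_tokens/lm_head in mixed precision" and the
# ~51% trainable-parameter fraction in the log. Rank is an assumption.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_alpha=64,
    use_gradient_checkpointing="unsloth",  # "Will smartly offload gradients"
)

# Assumed dataset file; the log only shows "Num examples = 37,031".
dataset = load_dataset("json", data_files="cpt_core_4.jsonl", split="train")

trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=UnslothTrainingArguments(
        per_device_train_batch_size=1,  # from the log
        gradient_accumulation_steps=1,  # from the log
        num_train_epochs=2,             # from the log
        learning_rate=5e-5,             # matches the logged initial LR
        embedding_learning_rate=5e-6,   # assumed smaller LR for embeddings
        bf16=True,
        logging_steps=1,
        output_dir="out-cpt-core-4",
        report_to="wandb",
        run_name="cpt-core-4",          # wandb run name from the log
    ),
)

trainer.train()
```

Training `embed_tokens` and `lm_head` alongside the LoRA adapters is what pushes the trainable fraction to roughly 51% here, which is typical for continued pretraining; for plain instruction tuning those two modules would normally be left out of `target_modules`.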