mtasic85 committed
Commit: f747e58
1 Parent(s): fd25030

cpt core 4

Files changed (1):
  1. README.md +29 -0
README.md CHANGED
@@ -411,5 +411,34 @@ CUDA_VISIBLE_DEVICES=0 python -B cpt_core_model_4.py
  ```
 
  ```
+ 🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
+ 🦥 Unsloth Zoo will now patch everything to make training faster!
+ ==((====))==  Unsloth 2025.3.14: Fast Llama patching. Transformers: 4.49.0.
+    \\   /|    NVIDIA GeForce RTX 3090. Num GPUs = 1. Max memory: 23.58 GB. Platform: Linux.
+ O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.2.0
+ \        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
+  "-____-"     Free license: http://github.com/unslothai/unsloth
+ Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
+ Unsloth 2025.3.14 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
+ Unsloth: Training embed_tokens in mixed precision to save VRAM
+ Unsloth: Training lm_head in mixed precision to save VRAM
+ ==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
+    \\   /|    Num examples = 37,031 | Num Epochs = 2 | Total steps = 37,406
+ O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 1
+ \        /    Data Parallel GPUs = 1 | Total batch size (1 x 1 x 1) = 1
+  "-____-"     Trainable parameters = 247,463,936/482,378,240 (51.30% trained)
+ wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
+ wandb: Currently logged in as: mtasic85 to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
+ wandb: Tracking run with wandb version 0.19.8
+ wandb: Run data is saved locally in /home/tangled/tangled-alpha-0.9-core/scripts/wandb/run-20250315_163140-55plsawx
+ wandb: Run `wandb offline` to turn off syncing.
+ wandb: Syncing run cpt-core-4
+ wandb: ⭐️ View project at https://wandb.ai/mtasic85/tangled-alpha-0.9-core
+ wandb: 🚀 View run at https://wandb.ai/mtasic85/tangled-alpha-0.9-core/runs/55plsawx
+ {'loss': 1.9674, 'grad_norm': 3.058457851409912, 'learning_rate': 4.999999991182871e-05, 'epoch': 0.0}
+   0%|          | 1/37406 [00:03<32:07:27, 3.09s/it]
+ Unsloth: Will smartly offload gradients to save VRAM!
+ {'loss': 3.4846, 'grad_norm': 1.5452028512954712, 'learning_rate': 4.999999964731482e-05, 'epoch': 0.0}
+   0%|▎         | 67/37406 [02:27<23:06:36, 2.23s/it]
  # ...
  ```
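
For context, below is a minimal sketch of the kind of Unsloth continued-pretraining script that could produce a run banner like the one logged above, launched as in the hunk header (`CUDA_VISIBLE_DEVICES=0 python -B cpt_core_model_4.py`). The actual `cpt_core_model_4.py` is not part of this diff: the model checkpoint, dataset file, sequence length, LoRA rank, and embedding learning rate below are placeholders, and only the batch size, gradient accumulation, epoch count, ~5e-5 learning rate, bf16 setting, trainable `embed_tokens`/`lm_head`, and the `cpt-core-4` wandb run name are taken from the log. Exact trainer argument names can differ across unsloth/trl versions.

```python
# Hypothetical reconstruction, NOT the repository's actual cpt_core_model_4.py.
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments
from datasets import load_dataset

max_seq_length = 4096  # placeholder

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./out/cpt-core-3",   # placeholder: checkpoint from the previous CPT stage
    max_seq_length=max_seq_length,
    dtype=None,                      # auto-selects bfloat16 on the RTX 3090
    load_in_4bit=False,
)

# Continued pretraining: include embed_tokens and lm_head among the trained modules,
# matching the "Training embed_tokens/lm_head in mixed precision" lines in the log.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,                            # placeholder rank
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_alpha=64,
    use_gradient_checkpointing="unsloth",
)

# Placeholder dataset; assumed to expose a "text" column.
dataset = load_dataset("json", data_files="cpt_core_4.jsonl", split="train")

trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=UnslothTrainingArguments(
        per_device_train_batch_size=1,   # from the log
        gradient_accumulation_steps=1,   # from the log
        num_train_epochs=2,              # from the log
        learning_rate=5e-5,              # from the log
        embedding_learning_rate=5e-6,    # assumption: lower LR for embed_tokens/lm_head
        bf16=True,                       # from the log
        logging_steps=1,
        report_to="wandb",
        run_name="cpt-core-4",           # from the log
        output_dir="out/cpt-core-4",     # placeholder
    ),
)

trainer.train()
```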