hindi-colloquial-translator

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 13.9788
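
Since this card describes a PEFT adapter over a 4-bit-quantized base model, a minimal inference sketch might look like the following. The adapter repo id is taken from this model's Hub page; the prompt format and generation settings are illustrative assumptions, not the authors' documented usage. Loading the bnb-4bit base requires the bitsandbytes and accelerate packages.

```python
# Minimal inference sketch (assumptions: adapter repo id from the Hub page,
# illustrative prompt; not an official usage snippet from the authors).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"               # base model named in this card
adapter_id = "smita1988/english-hindi-translation-model"  # adapter repo (assumed)

tokenizer = AutoTokenizer.from_pretrained(base_id)
# The base checkpoint ships a bitsandbytes quantization config, so it loads in 4-bit.
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Translate to colloquial Hindi: How are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```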

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a hedged TrainingArguments sketch follows the list:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 3
  • mixed_precision_training: Native AMP
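
For reference, the hyperparameters above map onto transformers TrainingArguments roughly as follows. The output directory and the Trainer wiring around this object are assumptions, not the authors' exact training script.

```python
# Hedged sketch: the hyperparameters above expressed as TrainingArguments
# (Transformers 4.48.x). output_dir is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hindi-colloquial-translator",  # assumed output path
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=3,
    fp16=True,  # "Native AMP" mixed precision
)
```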

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 14.9448 | 0.0333 | 10 | 12.0851 |
| 10.5162 | 0.0667 | 20 | 12.1086 |
| 6.5944 | 0.1 | 30 | 12.1206 |
| 4.5955 | 0.1333 | 40 | 12.1867 |
| 3.9384 | 0.1667 | 50 | 12.3252 |
| 3.7184 | 0.2 | 60 | 12.6046 |
| 3.596 | 0.2333 | 70 | 13.1815 |
| 3.5198 | 0.2667 | 80 | 13.3539 |
| 3.6022 | 0.3 | 90 | 12.9698 |
| 3.5499 | 0.3333 | 100 | 13.1654 |
| 3.5272 | 0.3667 | 110 | 13.3949 |
| 3.5894 | 0.4 | 120 | 13.4733 |
| 3.5825 | 0.4333 | 130 | 13.6201 |
| 3.5728 | 0.4667 | 140 | 13.6792 |
| 3.5061 | 0.5 | 150 | 13.6643 |
| 3.6101 | 0.5333 | 160 | 13.6148 |
| 3.557 | 0.5667 | 170 | 13.6482 |
| 3.5036 | 0.6 | 180 | 13.6264 |
| 3.4877 | 0.6333 | 190 | 13.6439 |
| 3.5853 | 0.6667 | 200 | 13.6861 |
| 3.5512 | 0.7 | 210 | 13.6245 |
| 3.5651 | 0.7333 | 220 | 13.7202 |
| 3.6101 | 0.7667 | 230 | 13.7473 |
| 3.5352 | 0.8 | 240 | 13.7057 |
| 3.5732 | 0.8333 | 250 | 13.7293 |
| 3.5483 | 0.8667 | 260 | 13.8731 |
| 3.5032 | 0.9 | 270 | 13.9614 |
| 3.5584 | 0.9333 | 280 | 14.0209 |
| 3.4704 | 0.9667 | 290 | 14.0475 |
| 3.5769 | 1.0 | 300 | 13.9849 |
| 3.5266 | 1.0333 | 310 | 13.9310 |
| 3.5559 | 1.0667 | 320 | 13.9112 |
| 3.522 | 1.1 | 330 | 13.9922 |
| 3.5856 | 1.1333 | 340 | 14.0535 |
| 3.5635 | 1.1667 | 350 | 14.0731 |
| 3.5347 | 1.2 | 360 | 14.0378 |
| 3.5998 | 1.2333 | 370 | 14.0262 |
| 3.5353 | 1.2667 | 380 | 13.9983 |
| 3.5461 | 1.3 | 390 | 13.9643 |
| 3.5471 | 1.3333 | 400 | 13.9684 |
| 3.5621 | 1.3667 | 410 | 13.9681 |
| 3.497 | 1.4 | 420 | 13.9489 |
| 3.4891 | 1.4333 | 430 | 13.9449 |
| 3.5001 | 1.4667 | 440 | 13.9471 |
| 3.4829 | 1.5 | 450 | 13.9471 |
| 3.5042 | 1.5333 | 460 | 13.9469 |
| 3.5616 | 1.5667 | 470 | 13.9399 |
| 3.5716 | 1.6 | 480 | 13.9463 |
| 3.5792 | 1.6333 | 490 | 13.9335 |
| 3.5115 | 1.6667 | 500 | 13.9425 |
| 3.432 | 1.7 | 510 | 13.9456 |
| 3.6118 | 1.7333 | 520 | 13.9393 |
| 3.53 | 1.7667 | 530 | 13.9433 |
| 3.5557 | 1.8 | 540 | 13.9360 |
| 3.5061 | 1.8333 | 550 | 13.9372 |
| 3.5141 | 1.8667 | 560 | 13.9420 |
| 3.5219 | 1.9 | 570 | 13.9484 |
| 3.5528 | 1.9333 | 580 | 13.9506 |
| 3.5767 | 1.9667 | 590 | 13.9557 |
| 3.576 | 2.0 | 600 | 13.9629 |
| 3.5206 | 2.0333 | 610 | 13.9666 |
| 3.5492 | 2.0667 | 620 | 13.9642 |
| 3.4966 | 2.1 | 630 | 13.9700 |
| 3.5562 | 2.1333 | 640 | 13.9725 |
| 3.4972 | 2.1667 | 650 | 13.9789 |
| 3.5234 | 2.2 | 660 | 13.9711 |
| 3.53 | 2.2333 | 670 | 13.9741 |
| 3.553 | 2.2667 | 680 | 13.9691 |
| 3.5519 | 2.3 | 690 | 13.9713 |
| 3.5854 | 2.3333 | 700 | 13.9782 |
| 3.4821 | 2.3667 | 710 | 13.9779 |
| 3.501 | 2.4 | 720 | 13.9834 |
| 3.5755 | 2.4333 | 730 | 13.9854 |
| 3.5215 | 2.4667 | 740 | 13.9791 |
| 3.5579 | 2.5 | 750 | 13.9823 |
| 3.5026 | 2.5333 | 760 | 13.9829 |
| 3.5423 | 2.5667 | 770 | 13.9793 |
| 3.5334 | 2.6 | 780 | 13.9822 |
| 3.5388 | 2.6333 | 790 | 13.9845 |
| 3.5442 | 2.6667 | 800 | 13.9853 |
| 3.5159 | 2.7 | 810 | 13.9844 |
| 3.4691 | 2.7333 | 820 | 13.9812 |
| 3.5149 | 2.7667 | 830 | 13.9821 |
| 3.5756 | 2.8 | 840 | 13.9809 |
| 3.552 | 2.8333 | 850 | 13.9823 |
| 3.5843 | 2.8667 | 860 | 13.9786 |
| 3.5026 | 2.9 | 870 | 13.9794 |
| 3.5703 | 2.9333 | 880 | 13.9802 |
| 3.5147 | 2.9667 | 890 | 13.9810 |
| 3.5719 | 3.0 | 900 | 13.9788 |
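
As a back-of-the-envelope check (an inference from the table, not a figure reported by the authors): epoch 1.0 is reached at optimizer step 300, and the effective batch size listed above is 8, implying roughly 2,400 training examples.

```python
# Rough dataset-size estimate derived from the table above (an assumption,
# not a number reported by the authors).
steps_per_epoch = 300        # epoch 1.0 is logged at optimizer step 300
total_train_batch_size = 8   # from the hyperparameters section
approx_examples = steps_per_epoch * total_train_batch_size
print(approx_examples)       # 2400
```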

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0