Model save

Files changed (4)

README.md +24 -15
generation_config.json +1 -2
model.safetensors +1 -1
runs/Sep23_04-52-24_ip-10-192-10-43/events.out.tfevents.1727067444.ip-10-192-10-43.2242.0 +2 -2

README.md CHANGED

@@ -1,6 +1,10 @@
 ---
+library_name: transformers
+base_model: indictrans-en-ne-checkpoint-1B/checkpoint-431
 tags:
 - generated_from_trainer
+metrics:
+- bleu
 model-index:
 - name: indictrans-en-ne-checkpoint-1B
   results: []
@@ -11,9 +15,12 @@ should probably proofread and complete it, then remove this comment. -->
 # indictrans-en-ne-checkpoint-1B
-This model was trained from scratch on an unknown dataset.
+This model is a fine-tuned version of [indictrans-en-ne-checkpoint-1B/checkpoint-431](https://huggingface.co/indictrans-en-ne-checkpoint-1B/checkpoint-431) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 7.1008
+- Loss: 0.1392
+- Bleu: 36.0128
+- Chrf: 60.1977
+- Num Input Tokens Seen: 196608000
 ## Model description
@@ -32,27 +39,29 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 0.0002
 - train_batch_size: 32
-- eval_batch_size: 2
+- eval_batch_size: 16
 - seed: 42
-- gradient_accumulation_steps: 64
-- total_train_batch_size: 2048
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 256
+- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
+- lr_scheduler_type: inverse_sqrt
+- lr_scheduler_warmup_steps: 500
 - num_epochs: 1
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 7.1087        | 0.4636 | 200  | 8.0102          |
-| 6.3473        | 0.9272 | 400  | 7.1008          |
+| Training Loss | Epoch  | Step | Validation Loss | Bleu    | Chrf    | Input Tokens Seen |
+|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-----------------:|
+| 0.1655        | 0.2897 | 1000 | 0.1563          | 32.3875 | 57.9890 | 65536000          |
+| 0.1545        | 0.5795 | 2000 | 0.1433          | 35.2047 | 59.6328 | 131072000         |
+| 0.1471        | 0.8692 | 3000 | 0.1392          | 36.0128 | 60.1977 | 196608000         |
 ### Framework versions
-- Transformers 4.43.3
-- Pytorch 2.5.0.dev20240819+cu118
-- Datasets 2.21.0
+- Transformers 4.44.2
+- Pytorch 2.2.1+cu121
+- Datasets 3.0.0
 - Tokenizers 0.19.1
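
Review note: the hyperparameter changes in README.md can be sanity-checked with a little arithmetic. The sketch below is not taken from the training code. It assumes a single device, so that the total batch size is `train_batch_size` times `gradient_accumulation_steps`, and it assumes the common form of an inverse square root schedule: linear warmup to the base rate, then decay proportional to 1/sqrt(step) with the timescale equal to the warmup length. The function names are hypothetical.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Sequences consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def inverse_sqrt_lr(step: int, base_lr: float = 2e-4, warmup_steps: int = 500) -> float:
    """Assumed schedule shape: linear warmup, then base_lr * sqrt(warmup_steps / step)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * math.sqrt(warmup_steps / step)


# Values copied from the old and new sides of the README diff.
old_total = effective_batch_size(32, 64)  # matches total_train_batch_size: 2048
new_total = effective_batch_size(32, 8)   # matches total_train_batch_size: 256

# The results table reports 65536000 input tokens seen at step 1000.
tokens_per_step = 65536000 // 1000
tokens_per_sequence = tokens_per_step // new_total

if __name__ == "__main__":
    print("old total batch:", old_total)
    print("new total batch:", new_total)
    print("tokens per optimizer step:", tokens_per_step)
    print("tokens per sequence:", tokens_per_sequence)
    for step in (250, 500, 1000, 2000, 3000):
        print(step, inverse_sqrt_lr(step))
```

Under these assumptions each optimizer step sees 65536 tokens, or 256 tokens per sequence, which would be consistent with inputs padded or packed to length 256. The model card does not state a maximum sequence length, so treat that as an inference rather than a documented setting.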

generation_config.json CHANGED

@@ -1,8 +1,7 @@
 {
-  "_from_model_config": true,
   "bos_token_id": 0,
   "decoder_start_token_id": 2,
   "eos_token_id": 2,
   "pad_token_id": 1,
-  "transformers_version": "4.43.3"
+  "transformers_version": "4.44.2"
 }

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:767d70bae1732e7938c521b21274cfac84f727a3d578d4a0a1267ba14ee63334
+oid sha256:cb7087096dc7f19e4f410ffc1bf520060b25c186d40fbde2534e352f8adaa33d
 size 2247492800

runs/Sep23_04-52-24_ip-10-192-10-43/events.out.tfevents.1727067444.ip-10-192-10-43.2242.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5a918ac068ba9cf08f666ade09c95a0e65c850b3248c1d53bd6741a6f9588c00
-size 16541
+oid sha256:c25785389f2f41ab943b1ab184b9008298333962d74f6808b2f2f62387ddb896
+size 17029
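
Review note: the last two files are Git LFS pointers, so the diff shows only the hash and byte size of each object, not its contents. A minimal sketch of reading such a pointer follows, assuming the three-line `key value` layout shown above. The helper name `parse_lfs_pointer` is hypothetical and not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each non-empty 'key value' line of an LFS pointer into a dict entry."""
    fields: dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Old and new pointers for the TensorBoard event file, copied from the diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5a918ac068ba9cf08f666ade09c95a0e65c850b3248c1d53bd6741a6f9588c00
size 16541
"""
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c25785389f2f41ab943b1ab184b9008298333962d74f6808b2f2f62387ddb896
size 17029
"""

old_fields = parse_lfs_pointer(OLD_POINTER)
new_fields = parse_lfs_pointer(NEW_POINTER)

# Byte growth of the event log between the two commits.
size_delta = int(new_fields["size"]) - int(old_fields["size"])

if __name__ == "__main__":
    print("oid changed:", old_fields["oid"] != new_fields["oid"])
    print("size delta (bytes):", size_delta)
```

For `model.safetensors` the same check would show a changed `oid` with an identical `size` of 2247492800 bytes, which is what one would expect when weights are updated without changing the tensor shapes or dtypes.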