cwenger committed
Commit bbd5f04 · verified · 1 Parent(s): bbb814d

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +95 -62

README.md CHANGED
@@ -1,63 +1,96 @@
- ---
- base_model: jbochi/madlad400-3b-mt
- library_name: peft
- license: apache-2.0
- tags:
- - generated_from_trainer
- model-index:
- - name: madlad400-finetuned-mbk-tpi
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # madlad400-finetuned-mbk-tpi
-
- This model is a fine-tuned version of [jbochi/madlad400-3b-mt](https://huggingface.co/jbochi/madlad400-3b-mt) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.1783
- - Chrf: 79.0009
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0005
- - train_batch_size: 4
- - eval_batch_size: 32
- - seed: 42
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 32
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 10.0
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Chrf |
- |:-------------:|:------:|:----:|:---------------:|:-------:|
- | 0.2957 | 7.7108 | 1600 | 0.2136 | 76.6433 |
-
-
- ### Framework versions
-
- - PEFT 0.12.0
- - Transformers 4.44.2
- - Pytorch 2.4.1+cu124
- - Datasets 2.21.0
+ ---
+ base_model: jbochi/madlad400-3b-mt
+ library_name: peft
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: madlad400-finetuned-mbk-tpi
+   results: []
+ language:
+ - mbk
+ - tpi
+ model_type: Translation
+ pipeline_tag: translation
+ ---
+
+ # madlad400-finetuned-mbk-tpi
+
+ This model is a fine-tuned version of [jbochi/madlad400-3b-mt](https://huggingface.co/jbochi/madlad400-3b-mt) for translation from Malol to Tok Pisin.
+
+ ## Model details
+
+ - **Developed by:** SIL Global
+ - **Finetuned from model:** jbochi/madlad400-3b-mt
+ - **Model type:** Translation
+ - **Source language:** Malol (`mbk`)
+ - **Target language:** Tok Pisin (`tpi`)
+ - **License:** closed/private
+
+ ## Datasets
+
+ The model was trained on a parallel corpus of plain text files (a loading sketch follows the list):
+
+ Malol:
+ - Malol Scriptures
+ - License: All rights reserved, Wycliffe Bible Translators. Used with permission.
+
+ Tok Pisin:
+ - Tok Pisin back-translation
+ - License: All rights reserved, Wycliffe Bible Translators. Used with permission.
+
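+ For reference, here is a minimal sketch of how such a line-aligned parallel corpus could be paired up for training; the file names are hypothetical placeholders, not files shipped with this card:
+
+ ```python
+ from datasets import Dataset
+
+ # Hypothetical file names for the line-aligned Malol / Tok Pisin text files.
+ with open("mbk.txt", encoding="utf-8") as src, open("tpi.txt", encoding="utf-8") as tgt:
+     mbk_lines = [line.strip() for line in src]
+     tpi_lines = [line.strip() for line in tgt]
+
+ # Each line in one file is assumed to correspond to the same line in the other.
+ assert len(mbk_lines) == len(tpi_lines), "parallel files must be line-aligned"
+
+ dataset = Dataset.from_dict({"mbk": mbk_lines, "tpi": tpi_lines})
+ print(dataset)
+ ```
+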
+ ## Usage
+
+ You can use this model with the `transformers` library (with `peft` installed, since this repository contains a PEFT adapter) like this:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained("sil-ai/madlad400-finetuned-mbk-tpi")
+ model = AutoModelForSeq2SeqLM.from_pretrained("sil-ai/madlad400-finetuned-mbk-tpi")
+
+ # The base MADLAD-400 model expects a target-language prefix token on the source
+ # text; for Tok Pisin this is assumed to be "<2tpi>".
+ inputs = tokenizer("<2tpi> Your input text here", return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=128)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
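+ Because this repository appears to store a PEFT adapter rather than full model weights (see the framework versions below), the adapter can also be attached to the base checkpoint explicitly with the `peft` library; a minimal sketch:
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+ from peft import PeftModel
+
+ # Load the base MADLAD-400 checkpoint, then apply this repository's adapter on top.
+ base = AutoModelForSeq2SeqLM.from_pretrained("jbochi/madlad400-3b-mt")
+ model = PeftModel.from_pretrained(base, "sil-ai/madlad400-finetuned-mbk-tpi")
+ tokenizer = AutoTokenizer.from_pretrained("jbochi/madlad400-3b-mt")
+
+ # For LoRA-style adapters the weights can optionally be merged for faster inference:
+ # model = model.merge_and_unload()
+ ```
+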
+
+ ## Evaluation results
+
+ The model achieves the following results on the evaluation set:
+
+ - Loss: 0.1783
+ - Chrf: 79.0009
+
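+ Chrf refers to the chrF character n-gram F-score. A minimal sketch of computing it with the `evaluate` library, using placeholder strings:
+
+ ```python
+ import evaluate
+
+ # chrF compares hypothesis and reference translations at the character n-gram level.
+ chrf = evaluate.load("chrf")
+ result = chrf.compute(
+     predictions=["model output goes here"],        # placeholder hypothesis
+     references=[["reference translation here"]],   # placeholder reference
+ )
+ print(result["score"])
+ ```
+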
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (mirrored in the sketch after this list):
+ - learning_rate: 0.0005
+ - train_batch_size: 4
+ - eval_batch_size: 32
+ - seed: 42
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 10.0
+
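+ A hedged sketch of how these values map onto `transformers` `Seq2SeqTrainingArguments`; the listed Adam betas and epsilon are the library defaults, and `output_dir` / `predict_with_generate` below are assumptions rather than values from this card:
+
+ ```python
+ from transformers import Seq2SeqTrainingArguments
+
+ training_args = Seq2SeqTrainingArguments(
+     output_dir="madlad400-finetuned-mbk-tpi",  # assumed output path
+     learning_rate=5e-4,
+     per_device_train_batch_size=4,
+     per_device_eval_batch_size=32,
+     gradient_accumulation_steps=8,   # effective train batch size: 4 * 8 = 32
+     num_train_epochs=10.0,
+     lr_scheduler_type="linear",
+     warmup_ratio=0.1,
+     seed=42,
+     predict_with_generate=True,      # assumed, so chrF can be computed at eval time
+ )
+ ```
+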
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Chrf |
+ |:-------------:|:------:|:----:|:---------------:|:-------:|
+ | 0.2957 | 7.7108 | 1600 | 0.2136 | 76.6433 |
+
+
+ ### Framework versions
+
+ - PEFT 0.12.0
+ - Transformers 4.44.2
+ - Pytorch 2.4.1+cu124
+ - Datasets 2.21.0
  - Tokenizers 0.19.1