SalomonMetre13 committed on
Commit 87f33fb · verified · 1 Parent(s): 8d6c06d

End of training

README.md CHANGED
@@ -1,111 +1,57 @@
 ---
- license: mit
- language:
- - luo
- - fr
- metrics:
- - bleu
- base_model:
- - facebook/nllb-200-distilled-600M
- pipeline_tag: translation
 ---
- # Luo-Swahili Machine Translation Model (NLLB-based)
-
- ## Model Details
-
- - **Model Name**: `nllb-luo-swa-mt-v1`
- - **Base Model**: `facebook/nllb-200-distilled-600M`
- - **Language Pair**: Luo (`luo`) to Swahili (`swa`)
- - **Dataset**: [SalomonMetre13/luo_swa_arXiv_2501.11003](https://huggingface.co/datasets/SalomonMetre13/luo_swa_arXiv_2501.11003)
-
- ## Description
-
- This model is fine-tuned for translating text from Luo to Swahili using the NLLB-200 model architecture. The fine-tuning process involves extending the tokenizer's vocabulary with custom language tokens and training the model on a specific dataset.
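The card does not include the vocabulary-extension code itself; a minimal sketch of that step, assuming the custom tokens are `<luo>` and `<swa>` as in the inference example further down, could look like this:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical reconstruction: the exact token-extension code is not part of this card.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

# Register the custom language tags (assumed to be <luo> and <swa>) and grow the
# embedding matrix so the new token ids have embedding rows.
num_added = tokenizer.add_special_tokens({"additional_special_tokens": ["<luo>", "<swa>"]})
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```

After this step the tags can be prepended to source sentences and used as forced BOS targets, which is what the inference helper below relies on.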
-
- ## Features
-
- - **Custom Tokenizer**: Extended with special tokens for Luo and Swahili.
- - **Training**: Fine-tuned on a dataset specifically curated for Luo-Swahili translation.
- - **Evaluation**: Uses BLEU score for performance evaluation.
- - **Inference**: Capable of translating new sentences and batches of text.
-
- ## Usage
-
- ### Installation
-
- Ensure you have the necessary libraries installed:
-
- ```bash
- pip install datasets transformers sacrebleu huggingface_hub accelerate torch
- ```
-
- ### Fine-Tuning
-
- 1. **Authentication**: Log in to Hugging Face to access the model and dataset.
- 2. **Preprocessing**: The dataset is preprocessed to include special language tokens.
- 3. **Training**: The model is fine-tuned using the `Seq2SeqTrainer` with specified training arguments.
- 4. **Evaluation**: The model's performance is evaluated using the BLEU metric.
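The training script behind these steps is not part of the card; the following is a rough sketch of how the `Seq2SeqTrainer` and BLEU evaluation could be wired together, assuming `tokenized_train` and `tokenized_eval` are placeholder names for datasets already preprocessed with the language tags described above:

```python
import numpy as np
import sacrebleu
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace the label padding value (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    pred_text = tokenizer.batch_decode(preds, skip_special_tokens=True)
    ref_text = tokenizer.batch_decode(labels, skip_special_tokens=True)
    return {"bleu": sacrebleu.corpus_bleu(pred_text, [ref_text]).score}

args = Seq2SeqTrainingArguments(
    output_dir="nllb-luo-swa-mt-v1",
    predict_with_generate=True,  # needed so BLEU is computed on generated text
    eval_strategy="steps",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # placeholder: preprocessed training split
    eval_dataset=tokenized_eval,    # placeholder: preprocessed validation split
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    compute_metrics=compute_metrics,
)
trainer.train()
```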
-
- ### Inference
-
- #### Translate Single Sentence
-
- To translate a single sentence from Luo to Swahili, use the `translate_custom_sentence` function. Here's an example:
-
- ```python
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
-
- # Load the fine-tuned model and tokenizer
- model_name = "SalomonMetre13/nllb-luo-swa-mt-v1"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
-
- def translate_custom_sentence(src_text: str, src_lang: str = "luo", tgt_lang: str = "swa") -> str:
-     formatted_text = f"<{src_lang}> {src_text.strip()}"
-
-     # Tokenize input
-     inputs = tokenizer(
-         formatted_text,
-         return_tensors="pt",
-         max_length=128,
-         truncation=True
-     ).to(model.device)
-
-     # Generate translation
-     outputs = model.generate(
-         inputs.input_ids,
-         forced_bos_token_id=tokenizer.convert_tokens_to_ids(f"<{tgt_lang}>"),
-         max_length=150
-     )
-
-     # Decode and clean output
-     return tokenizer.decode(outputs[0], skip_special_tokens=True)
-
- # Example usage
- src_text = "Le feu crépite dans la nuit silencieuse."
- translation = translate_custom_sentence(src_text)
- print(f"Luo: {src_text}")
- print(f"Swahili: {translation}")
- ```
-
- #### Translate Batch
-
- For batch translation, use the `translate_batch` function. This function processes multiple sentences at once, which can be more efficient for larger translation tasks.
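The `translate_batch` helper itself is not shown in the card; one possible implementation, consistent with `translate_custom_sentence` above and reusing the same `tokenizer` and `model`, is sketched here (the original may differ):

```python
from typing import List

def translate_batch(src_texts: List[str], src_lang: str = "luo",
                    tgt_lang: str = "swa", batch_size: int = 16) -> List[str]:
    """Batched variant of translate_custom_sentence (illustrative sketch)."""
    translations: List[str] = []
    for i in range(0, len(src_texts), batch_size):
        # Prepend the source-language tag to each sentence in the chunk.
        batch = [f"<{src_lang}> {t.strip()}" for t in src_texts[i : i + batch_size]]
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            padding=True,
            max_length=128,
            truncation=True,
        ).to(model.device)
        outputs = model.generate(
            **inputs,
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(f"<{tgt_lang}>"),
            max_length=150,
        )
        translations.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return translations
```

Padding within each chunk keeps tensor shapes uniform, which is what makes the batched call cheaper than translating sentence by sentence.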
-
- ## Performance
-
- The model's performance is evaluated using the BLEU score on the test set. The BLEU score provides an indication of the translation quality.
-
- ## Limitations
-
- - The model is trained with a maximum input length of 512 tokens, which may limit its effectiveness on longer texts.
- - The dataset used for fine-tuning may influence the model's performance on specific domains or styles of text.
-
- ## Future Work
-
- - Explore fine-tuning on additional datasets to improve robustness.
- - Experiment with different training parameters and architectures to enhance performance.
-
- ## Contact
-
- For questions or feedback, please contact [[email protected]].
 ---
+ library_name: transformers
+ license: cc-by-nc-4.0
+ base_model: facebook/nllb-200-distilled-600M
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: nllb-luo-swa-mt-v1
+   results: []
 ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # nllb-luo-swa-mt-v1
+
+ This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - eval_loss: 0.0895
+ - eval_runtime: 178.9825
+ - eval_samples_per_second: 16.354
+ - eval_steps_per_second: 8.18
+ - epoch: 0.4936
+ - step: 6500
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 3e-05
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - num_epochs: 1
+ - mixed_precision_training: Native AMP
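The exact training script is not included in this card; as a rough guide, the reported values would map onto `Seq2SeqTrainingArguments` approximately as follows (argument names are the standard Transformers ones, not taken from the original script):

```python
from transformers import Seq2SeqTrainingArguments

# Approximate mapping of the reported hyperparameters onto training arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-luo-swa-mt-v1",
    learning_rate=3e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",        # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                  # "Native AMP" mixed precision
)
```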
+
+ ### Framework versions
+
+ - Transformers 4.50.3
+ - Pytorch 2.6.0+cu124
+ - Datasets 3.5.0
+ - Tokenizers 0.21.1
generation_config.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 0,
+   "decoder_start_token_id": 2,
+   "eos_token_id": 2,
+   "max_length": 200,
+   "pad_token_id": 1,
+   "transformers_version": "4.50.3"
+ }
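These generation defaults are picked up automatically by `model.generate`; if needed, they can also be inspected or overridden explicitly, for example:

```python
from transformers import GenerationConfig

# Loads the generation_config.json added in this commit; individual fields
# can still be overridden per call to generate().
gen_config = GenerationConfig.from_pretrained("SalomonMetre13/nllb-luo-swa-mt-v1")
print(gen_config.max_length)  # 200, per the file above
```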
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:5f80afe2a4a689b253a41bcdc52e878cb3fbaefe1f1934db40a20678d3954c9b
 size 2460354912

 version https://git-lfs.github.com/spec/v1
+ oid sha256:c6846789c56069fe8bc1dc0d95539b148f13ebce2581fe990d7829830922599a
 size 2460354912
runs/Apr12_11-39-52_495dcc39ad78/events.out.tfevents.1744458014.495dcc39ad78.3873.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:87c784916c19d5a9dfcbac8e5835165f75142b052e97ce8ca568d570e865bcfe
- size 22675

 version https://git-lfs.github.com/spec/v1
+ oid sha256:8c8d952b9fa912e2c9f8b28b2353321432e61ca9e2554303377986433e62552a
+ size 23730