SalomonMetre13 commited on
Commit
ba7f724
·
verified ·
1 Parent(s): 8c67cb4

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +53 -45
README.md CHANGED
@@ -1,59 +1,67 @@
1
- ---
2
- library_name: transformers
3
- license: cc-by-nc-4.0
4
- base_model: SalomonMetre13/nllb-luo-swa-mt-v1
5
- tags:
6
- - generated_from_trainer
7
- model-index:
8
- - name: nllb-luo-swa-mt-v1
9
- results: []
10
- ---
11
 
12
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
13
- should probably proofread and complete it, then remove this comment. -->
14
 
15
- # nllb-luo-swa-mt-v1
 
 
 
 
 
 
16
 
17
- This model is a fine-tuned version of [SalomonMetre13/nllb-luo-swa-mt-v1](https://huggingface.co/SalomonMetre13/nllb-luo-swa-mt-v1) on an unknown dataset.
18
- It achieves the following results on the evaluation set:
19
- - eval_loss: 0.1146
20
- - eval_bleu: 19.64
21
- - eval_runtime: 798.5876
22
- - eval_samples_per_second: 3.665
23
- - eval_steps_per_second: 0.917
24
- - epoch: 0.4556
25
- - step: 3000
26
 
27
- ## Model description
28
 
29
- More information needed
30
 
31
- ## Intended uses & limitations
 
 
 
 
 
32
 
33
- More information needed
 
 
 
34
 
35
- ## Training and evaluation data
36
 
37
- More information needed
38
 
39
- ## Training procedure
40
 
41
- ### Training hyperparameters
 
42
 
43
- The following hyperparameters were used during training:
44
- - learning_rate: 3e-05
45
- - train_batch_size: 4
46
- - eval_batch_size: 4
47
- - seed: 42
48
- - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
49
- - lr_scheduler_type: linear
50
- - lr_scheduler_warmup_steps: 200
51
- - num_epochs: 10
52
- - mixed_precision_training: Native AMP
53
 
54
- ### Framework versions
 
55
 
56
- - Transformers 4.51.3
57
- - Pytorch 2.6.0+cu124
58
- - Datasets 3.5.1
59
- - Tokenizers 0.21.1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Model Card for nllb-luo-swa-mt-v1
 
 
 
 
 
 
 
 
 
2
 
3
+ ## Model Overview
 
4
 
5
+ **Model Name**: nllb-luo-swa-mt-v1
6
+ **Model Type**: Machine Translation (Luo (Dholuo) to Swahili)
7
+ **Base Model**: NLLB-200-distilled-600M
8
+ **Languages**: Luo (Dholuo), Swahili
9
+ **Version**: 1.0
10
+ **License**: CC0 (Public Domain)
11
+ **Dataset**: [SalomonMetre13/luo_swa_arXiv_2501.11003](https://huggingface.co/datasets/SalomonMetre13/luo_swa_arXiv_2501.11003)
12
 
13
+ This model is a fine-tuned version of the `NLLB-200-distilled-600M` model for translation between Luo (Dholuo) and Swahili. It was trained on a parallel corpus derived from the Dholuo–Swahili corpus created by Mbogho et al. (2025), based on community-driven data collection efforts.
 
 
 
 
 
 
 
 
14
 
15
+ ## Model Description
16
 
17
+ The `nllb-luo-swa-mt-v1` model performs machine translation from Luo (Dholuo) to Swahili, designed to improve translation capabilities for these low-resource languages. It was fine-tuned using the parallel corpus from the paper **"Building low-resource African language corpora: A case study of Kidawida, Kalenjin and Dholuo"** by Mbogho et al. (2025). This model is particularly valuable for promoting linguistic diversity and facilitating the development of Natural Language Processing (NLP) tools in African languages.
18
 
19
+ ### Key Features:
20
+ - **Training Data**: Fine-tuned on the Dholuo–Swahili parallel text corpus from the dataset [SalomonMetre13/luo_swa_arXiv_2501.11003](https://huggingface.co/datasets/SalomonMetre13/luo_swa_arXiv_2501.11003), derived from the grassroots data collection effort by Mbogho et al. (2025).
21
+ - **Performance**: Achieved a BLEU score of 21.56 on the evaluation set, showing strong performance in a low-resource setting.
22
+ - **Qualitative Analysis**: Translations generated by this model are sometimes more fluent and accurate than the provided reference translations.
23
+
24
+ ## Intended Use
25
 
26
+ This model can be used for machine translation applications between Luo (Dholuo) and Swahili. Potential use cases include:
27
+ - **Educational tools**: Enabling educational content in both languages, aiding language learners and teachers.
28
+ - **Public health and community development**: Translating health information, community messages, and official communications.
29
+ - **Cultural preservation**: Supporting the preservation and growth of the Luo language in the digital age.
30
 
31
+ ## Model Evaluation
32
 
33
+ The model was evaluated using the BLEU score, which is commonly used to assess machine translation performance. A BLEU score of 21.56 was achieved, which is a strong result for a low-resource language pair. Qualitative analysis of the translations suggests that, in some cases, the model's outputs outperform the reference translations in terms of fluency and accuracy.
34
 
35
+ ## Training Details
36
 
37
+ - **Training Data**: The model was trained on the Dholuo–Swahili parallel corpus, based on the dataset [SalomonMetre13/luo_swa_arXiv_2501.11003](https://huggingface.co/datasets/SalomonMetre13/luo_swa_arXiv_2501.11003) derived from Mbogho et al.'s (2025) work. The corpus includes text translations and is publicly available for further use and improvement.
38
+ - **Model Architecture**: The model is fine-tuned from the `NLLB-200-distilled-600M` version of the NLLB model family, which is designed for multilingual translation tasks.
39
 
40
+ ## Limitations
 
 
 
 
 
 
 
 
 
41
 
42
+ - **Low-Resource Context**: While the model performs well given the limited amount of data, its performance may still lag behind models trained on larger corpora for more widely spoken languages.
43
+ - **Domain-Specific Use**: The model may require additional fine-tuning to perform optimally on domain-specific text such as medical, legal, or technical content.
44
 
45
+ ## Future Directions
46
+
47
+ - **Expanding the Dataset**: The quality and coverage of the model could be improved by incorporating more diverse and larger datasets.
48
+ - **Additional Language Pairs**: Further fine-tuning of the model to support other language pairs involving Luo and Swahili could make the model even more versatile.
49
+ - **Real-World Applications**: The model could be applied to real-world projects such as translating educational materials, public health information, or community communication platforms.
50
+
51
+ ## Acknowledgements
52
+
53
+ This model was developed based on the Dholuo–Swahili parallel corpus created by Mbogho et al.\ (2025) as part of their work in building low-resource African language corpora. The corpus was made publicly available on platforms like Zenodo and Mozilla Common Voice.
54
+
55
+ ## How to Use
56
+
57
+ You can access the model via the Hugging Face Hub at:
58
+ [https://huggingface.co/SalomonMetre13/nllb-luo-swa-mt-v1](https://huggingface.co/SalomonMetre13/nllb-luo-swa-mt-v1)
59
+
60
+ To load the model using the Hugging Face `transformers` library, use the following code:
61
+
62
+ ```python
63
+ from transformers import pipeline
64
+
65
+ translator = pipeline("translation", model="SalomonMetre13/nllb-luo-swa-mt-v1")
66
+ translation = translator("Ninapenda kujua kuhusu lugha ya Dholuo.")
67
+ print(translation)