---
license: mit
language:
- luo
- sw
metrics:
- bleu
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
---

# Luo-Swahili Machine Translation Model (NLLB-based)

## Model Details

- **Model Name**: `nllb-luo-swa-mt-v1`
- **Base Model**: `facebook/nllb-200-distilled-600M`
- **Language Pair**: Luo (`luo`) to Swahili (`swa`)
- **Dataset**: `SalomonMetre13/luo_swa_arXiv_2501.11003`

## Description

This model is fine-tuned to translate text from Luo to Swahili using the NLLB-200 architecture. Fine-tuning extends the tokenizer's vocabulary with custom language tokens and trains the model on the dataset listed above. A sketch of the vocabulary-extension step follows.
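The exact tokens added are not published with this card; the snippet below is a minimal sketch of the vocabulary-extension step, assuming hypothetical `<luo>` and `<swa>` language tokens.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

BASE = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForSeq2SeqLM.from_pretrained(BASE)

# Hypothetical custom language tokens; the actual tokens used by the
# training script are not specified in this card.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<luo>", "<swa>"]}
)
if num_added:
    # Grow the embedding matrix so the new token ids have rows.
    model.resize_token_embeddings(len(tokenizer))
```

Note that NLLB-200's tokenizer already ships `luo_Latn` and `swh_Latn` language codes, so an alternative design is to reuse those instead of adding new tokens.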
## Features

- **Custom Tokenizer**: Extended with special tokens for Luo and Swahili.
- **Training**: Fine-tuned on a dataset specifically curated for Luo-Swahili translation.
- **Evaluation**: Uses the BLEU score for performance evaluation.
- **Inference**: Capable of translating single sentences and batches of text.
## Usage

### Installation

Ensure you have the necessary libraries installed:

```bash
pip install datasets transformers sacrebleu huggingface_hub accelerate torch
```
### Fine-Tuning

1. **Authentication**: Log in to Hugging Face to access the model and dataset.
2. **Preprocessing**: The dataset is preprocessed to include the special language tokens.
3. **Training**: The model is fine-tuned using the `Seq2SeqTrainer` with the specified training arguments (see the sketch after this list).
4. **Evaluation**: The model's performance is evaluated using the BLEU metric.
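The training script itself is not reproduced in this card. The sketch below, continuing from the tokenizer snippet above, shows how these steps might be wired together; the dataset column names (`luo`, `swa`) and every hyperparameter are assumptions, not the published configuration.

```python
from datasets import load_dataset
from huggingface_hub import login
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

login()  # step 1: authenticate to access the model and dataset

raw = load_dataset("SalomonMetre13/luo_swa_arXiv_2501.11003")

def preprocess(batch):
    # Step 2: prepend the (hypothetical) language tokens and tokenize
    # source and target sides; the column names are assumptions.
    return tokenizer(
        ["<luo> " + s for s in batch["luo"]],
        text_target=["<swa> " + t for t in batch["swa"]],
        max_length=512,
        truncation=True,
    )

tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw["train"].column_names)

# Step 3: fine-tune with Seq2SeqTrainer (placeholder hyperparameters).
training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-luo-swa-mt-v1",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```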
### Inference

- **Translate Single Sentence**: Use the `translate_custom_sentence` function to translate individual sentences.
- **Translate Batch**: Use the `translate_batch` function for batch translation (a sketch of both functions follows this list).
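The bodies of `translate_custom_sentence` and `translate_batch` are not shown in this card; the following is a plausible reconstruction under the same `<luo>`/`<swa>` token assumption as above, not the script's actual code.

```python
import torch

def translate_batch(sentences, max_length=512):
    # Prepend the (hypothetical) source token, then generate with the
    # target token forced as the first decoder output.
    inputs = tokenizer(
        ["<luo> " + s for s in sentences],
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_length,
    )
    with torch.no_grad():
        generated = model.generate(
            **inputs,
            forced_bos_token_id=tokenizer.convert_tokens_to_ids("<swa>"),
            max_length=max_length,
        )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

def translate_custom_sentence(sentence):
    return translate_batch([sentence])[0]
```

Called on a single string, `translate_custom_sentence` returns one Swahili translation; `translate_batch` accepts a list of Luo sentences.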
## Performance

The model is evaluated with the BLEU score on the test set, which gives an indication of translation quality.
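No BLEU number is reported in this card. Continuing from the snippets above, a sketch of how the test-set score might be computed with `sacrebleu` (the split and column names are again assumptions about the dataset schema):

```python
import sacrebleu

test = raw["test"]  # assumes the dataset ships a test split
hypotheses = translate_batch(test["luo"])
# sacrebleu expects a list of reference lists (one per reference set)
bleu = sacrebleu.corpus_bleu(hypotheses, [test["swa"]])
print(f"BLEU: {bleu.score:.2f}")
```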
## Limitations

- The model is trained with a maximum input length of 512 tokens, which may limit its effectiveness on longer texts.
- The dataset used for fine-tuning may influence the model's performance on specific domains or styles of text.

## Future Work

- Explore fine-tuning on additional datasets to improve robustness.
- Experiment with different training parameters and architectures to enhance performance.

## Contact

For questions or feedback, please contact [Your Contact Information].