---
license: mit
language:
- luo
- sw
metrics:
- bleu
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
---

# Luo-Swahili Machine Translation Model (NLLB-based)

## Model Details

- **Model Name**: `nllb-luo-swa-mt-v1`
- **Base Model**: `facebook/nllb-200-distilled-600M`
- **Language Pair**: Luo (`luo`) to Swahili (`swa`)
- **Dataset**: `SalomonMetre13/luo_swa_arXiv_2501.11003`

## Description

This model is fine-tuned to translate text from Luo to Swahili using the NLLB-200 architecture. Fine-tuning extends the tokenizer's vocabulary with custom language tokens for the two languages and then trains the model on the parallel dataset listed above.

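The snippet below is a minimal sketch of that vocabulary extension using standard `transformers` APIs; the exact token strings the training script registers are an assumption based on the `luo`/`swa` codes above.

```python
# Sketch of the custom-token extension; token names are illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

# Register the Luo and Swahili language tokens, then resize the embedding
# matrix so the model has rows for the new token ids.
tokenizer.add_special_tokens({"additional_special_tokens": ["luo", "swa"]})
model.resize_token_embeddings(len(tokenizer))
```
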
## Features

- **Custom Tokenizer**: Extended with special tokens for Luo and Swahili.
- **Training**: Fine-tuned on a dataset specifically curated for Luo-Swahili translation.
- **Evaluation**: Uses the BLEU score for performance evaluation.
- **Inference**: Translates single sentences as well as batches of text.

## Usage

### Installation

Ensure you have the necessary libraries installed:

```bash
pip install datasets transformers sacrebleu huggingface_hub accelerate torch
```

### Fine-Tuning

1. **Authentication**: Log in to Hugging Face to access the model and dataset.
2. **Preprocessing**: The dataset is preprocessed to include the special language tokens.
3. **Training**: The model is fine-tuned with `Seq2SeqTrainer` and the specified training arguments.
4. **Evaluation**: Performance is measured with the BLEU metric (see the sketch after this list).

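The original training script is not reproduced in this card, so the following is only a hedged sketch of steps 1-4 built from standard `datasets`/`transformers` APIs. The dataset column names (`luo`, `swa`) and every hyperparameter shown are assumptions, not the configuration actually used to produce this model.

```python
# Hypothetical fine-tuning sketch; column names and hyperparameters are
# assumptions, not the exact training setup of nllb-luo-swa-mt-v1.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Step 1 (authentication): run `huggingface-cli login` beforehand if needed.
dataset = load_dataset("SalomonMetre13/luo_swa_arXiv_2501.11003")
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

# Step 2 (preprocessing): tokenize source and target; 512 matches the
# maximum input length noted under Limitations.
def preprocess(batch):
    model_inputs = tokenizer(batch["luo"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["swa"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)

# Steps 3-4 (training and evaluation) via Seq2SeqTrainer.
args = Seq2SeqTrainingArguments(
    output_dir="nllb-luo-swa-mt-v1",
    per_device_train_batch_size=8,  # assumption
    learning_rate=2e-5,             # assumption
    num_train_epochs=3,             # assumption
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized.get("test"),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```
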
### Inference

- **Translate Single Sentence**: Use the `translate_custom_sentence` function to translate individual sentences.
- **Translate Batch**: Use the `translate_batch` function for batch translation (both are sketched below).

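Those helpers are defined in the training script rather than in this card, so the following is only an illustrative re-implementation built on `model.generate`; the real signatures, and the repo id shown, may differ.

```python
# Illustrative versions of the inference helpers named above; signatures
# and the repo id are assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "nllb-luo-swa-mt-v1"  # replace with the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

def translate_batch(sentences, max_length=512):
    # Tokenize a list of Luo sentences and decode the generated Swahili.
    # Depending on how the custom language tokens are used, a forced BOS
    # token for the target language may also be required.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_length)
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

def translate_custom_sentence(sentence):
    return translate_batch([sentence])[0]
```
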
## Performance

The model is evaluated with the BLEU score on the test set. BLEU measures n-gram overlap between model output and reference translations, so higher scores indicate closer agreement with the references.

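Computing the score with `sacrebleu` (already in the install list) looks roughly like this; the sentences shown are placeholders, not actual model outputs:

```python
# Placeholder BLEU computation; hypotheses and references are illustrative.
import sacrebleu

hypotheses = ["Habari ya asubuhi."]    # model translations (Swahili)
references = [["Habari ya asubuhi."]]  # one reference stream, parallel to hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")
```
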
## Limitations

- The model is trained with a maximum input length of 512 tokens, which may limit its effectiveness on longer texts.
- The fine-tuning dataset shapes what the model has seen, so quality may drop on domains or styles of text it does not cover.

## Future Work

- Explore fine-tuning on additional datasets to improve robustness.
- Experiment with different training parameters and architectures to enhance performance.

## Contact

For questions or feedback, please contact [Your Contact Information].