End of training

Files changed (4)

README.md +41 -95
generation_config.json +9 -0
model.safetensors +1 -1
runs/Apr12_11-39-52_495dcc39ad78/events.out.tfevents.1744458014.495dcc39ad78.3873.0 +2 -2

README.md CHANGED

@@ -1,111 +1,57 @@
 ---
-license: mit
-language:
-- luo
-- fr
-metrics:
-- bleu
-base_model:
-- facebook/nllb-200-distilled-600M
-pipeline_tag: translation
 ---
-# Luo-Swahili Machine Translation Model (NLLB-based)
-## Model Details
-- **Model Name**: `nllb-luo-swa-mt-v1`
-- **Base Model**: `facebook/nllb-200-distilled-600M`
-- **Language Pair**: Luo (`luo`) to Swahili (`swa`)
-- **Dataset**: [SalomonMetre13/luo_swa_arXiv_2501.11003](https://huggingface.co/datasets/SalomonMetre13/luo_swa_arXiv_2501.11003)
-## Description
-This model is fine-tuned for translating text from Luo to Swahili using the NLLB-200 model architecture. The fine-tuning process involves extending the tokenizer's vocabulary with custom language tokens and training the model on a specific dataset.
-## Features
-- **Custom Tokenizer**: Extended with special tokens for Luo and Swahili.
-- **Training**: Fine-tuned on a dataset specifically curated for Luo-Swahili translation.
-- **Evaluation**: Uses BLEU score for performance evaluation.
-- **Inference**: Capable of translating new sentences and batches of text.
-## Usage
-### Installation
-Ensure you have the necessary libraries installed:
-```bash
-pip install datasets transformers sacrebleu huggingface_hub accelerate torch
-```
-### Fine-Tuning
-1. **Authentication**: Log in to Hugging Face to access the model and dataset.
-2. **Preprocessing**: The dataset is preprocessed to include special language tokens.
-3. **Training**: The model is fine-tuned using the `Seq2SeqTrainer` with specified training arguments.
-4. **Evaluation**: The model's performance is evaluated using the BLEU metric.
-### Inference
-#### Translate Single Sentence
-To translate a single sentence from Luo to Swahili, use the `translate_custom_sentence` function. Here's an example:
-```python
-from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
-# Load the fine-tuned model and tokenizer
-model_name = "SalomonMetre13/nllb-luo-swa-mt-v1"
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
-def translate_custom_sentence(src_text: str, src_lang: str = "luo", tgt_lang: str = "swa") -> str:
-    formatted_text = f"<{src_lang}> {src_text.strip()}"
-    # Tokenize input
-    inputs = tokenizer(
-        formatted_text,
-        return_tensors="pt",
-        max_length=128,
-        truncation=True
-    ).to(model.device)
-    # Generate translation
-    outputs = model.generate(
-        inputs.input_ids,
-        forced_bos_token_id=tokenizer.convert_tokens_to_ids(f"<{tgt_lang}>"),
-        max_length=150
-    )
-    # Decode and clean output
-    return tokenizer.decode(outputs[0], skip_special_tokens=True)
-# Example usage
-src_text = "Le feu crépite dans la nuit silencieuse."
-translation = translate_custom_sentence(src_text)
-print(f"Luo: {src_text}")
-print(f"Swahili: {translation}")
-```
-#### Translate Batch
-For batch translation, use the `translate_batch` function. This function processes multiple sentences at once, which can be more efficient for larger translation tasks.
-## Performance
-The model's performance is evaluated using the BLEU score on the test set. The BLEU score provides an indication of the translation quality.
-## Limitations
-- The model is trained with a maximum input length of 512 tokens, which may limit its effectiveness on longer texts.
-- The dataset used for fine-tuning may influence the model's performance on specific domains or styles of text.
-## Future Work
-- Explore fine-tuning on additional datasets to improve robustness.
-- Experiment with different training parameters and architectures to enhance performance.
-## Contact
-For questions or feedback, please contact [[email protected]].

 ---
+library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/nllb-200-distilled-600M
+tags:
+- generated_from_trainer
+model-index:
+- name: nllb-luo-swa-mt-v1
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# nllb-luo-swa-mt-v1
+This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- eval_loss: 0.0895
+- eval_runtime: 178.9825
+- eval_samples_per_second: 16.354
+- eval_steps_per_second: 8.18
+- epoch: 0.4936
+- step: 6500
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 1
+- mixed_precision_training: Native AMP
+### Framework versions
+- Transformers 4.50.3
+- Pytorch 2.6.0+cu124
+- Datasets 3.5.0
+- Tokenizers 0.21.1
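
Review note: the evaluation figures in the regenerated card can be cross-checked with a little arithmetic. The sketch below is a sanity check and not part of the commit. It assumes a single device with no gradient accumulation (the card states neither), and the reported values are rounded, so every result is approximate.

```python
# Values copied from the auto-generated model card in this diff.
eval_runtime_s = 178.9825
eval_samples_per_second = 16.354
eval_steps_per_second = 8.18
eval_batch_size = 2
train_batch_size = 2
epoch = 0.4936
step = 6500

# Throughput multiplied by runtime recovers the approximate evaluation set size.
eval_samples = eval_runtime_s * eval_samples_per_second
eval_steps = eval_runtime_s * eval_steps_per_second

# The fraction of an epoch completed at a known step gives steps per epoch.
steps_per_epoch = step / epoch

# ASSUMPTION: one device and gradient_accumulation_steps == 1,
# so each optimizer step consumes exactly train_batch_size examples.
train_examples = steps_per_epoch * train_batch_size

print(f"eval samples (approx.):     {eval_samples:.0f}")
print(f"eval steps (approx.):       {eval_steps:.0f}")
print(f"samples per eval step:      {eval_samples / eval_steps:.2f}")
print(f"steps per epoch (approx.):  {steps_per_epoch:.0f}")
print(f"train examples (approx.):   {train_examples:.0f}")
```

The ratio of evaluation samples to evaluation steps should come out close to the listed `eval_batch_size` of 2, which suggests the throughput numbers are internally consistent. The card also records `epoch: 0.4936` against `num_epochs: 1`, so these metrics describe a checkpoint taken roughly halfway through the single configured epoch.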

generation_config.json ADDED

@@ -0,0 +1,9 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 0,
+  "decoder_start_token_id": 2,
+  "eos_token_id": 2,
+  "max_length": 200,
+  "pad_token_id": 1,
+  "transformers_version": "4.50.3"
+}
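
Review note: the added file supplies default generation settings but no `forced_bos_token_id`, so a caller would still have to pass the target-language token at call time, as the removed README example did. The sketch below illustrates how call-time arguments take precedence over the file's defaults. `effective_generation_kwargs` is a hypothetical helper written for this note, not a `transformers` function, and the `max_length=150` override is borrowed from the removed README example.

```python
import json

# The generation_config.json added in this commit, reproduced verbatim.
GENERATION_CONFIG = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "eos_token_id": 2,
  "max_length": 200,
  "pad_token_id": 1,
  "transformers_version": "4.50.3"
}
""")

# Keys that record provenance rather than control generation.
BOOKKEEPING_KEYS = ("_from_model_config", "transformers_version")


def effective_generation_kwargs(config: dict, **overrides) -> dict:
    """Hypothetical helper: layer call-time overrides on top of file defaults."""
    defaults = {k: v for k, v in config.items() if k not in BOOKKEEPING_KEYS}
    # Later entries win, mirroring how explicit generate() arguments
    # override values stored in the generation config.
    return {**defaults, **overrides}


kwargs = effective_generation_kwargs(GENERATION_CONFIG, max_length=150)
print(kwargs)
print("forced_bos_token_id in file:", "forced_bos_token_id" in GENERATION_CONFIG)
```

With no override, generation would fall back to the file's `max_length` of 200 rather than the 150 used in the removed example.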

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5f80afe2a4a689b253a41bcdc52e878cb3fbaefe1f1934db40a20678d3954c9b
 size 2460354912

 version https://git-lfs.github.com/spec/v1
+oid sha256:c6846789c56069fe8bc1dc0d95539b148f13ebce2581fe990d7829830922599a
 size 2460354912
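
Review note: a Git LFS pointer file is a short list of `key value` lines, where `oid` is the SHA-256 of the stored object and `size` is its length in bytes. The sketch below parses the old and new pointers shown above and compares them. `parse_lfs_pointer` is a hypothetical helper for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:5f80afe2a4a689b253a41bcdc52e878cb3fbaefe1f1934db40a20678d3954c9b
size 2460354912
""")

new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:c6846789c56069fe8bc1dc0d95539b148f13ebce2581fe990d7829830922599a
size 2460354912
""")

print("content changed:", old["oid"] != new["oid"])
print("size changed:   ", old["size"] != new["size"])
```

A different `oid` with an identical `size` is consistent with updated weights stored in the same tensor layout and dtype, although the pointer alone cannot confirm that.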

runs/Apr12_11-39-52_495dcc39ad78/events.out.tfevents.1744458014.495dcc39ad78.3873.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:87c784916c19d5a9dfcbac8e5835165f75142b052e97ce8ca568d570e865bcfe
-size 22675

 version https://git-lfs.github.com/spec/v1
+oid sha256:8c8d952b9fa912e2c9f8b28b2353321432e61ca9e2554303377986433e62552a
+size 23730