---
datasets:
- atlasia/darija_english
---
# Darija-English Translator
This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the `darija_finetune_train` dataset. It is designed to translate text from Moroccan Darija (a dialect of Arabic) to English.
## Model Details
- **Library**: PEFT
- **License**: Apache 2.0
- **Base Model**: Qwen/Qwen2.5-1.5B-Instruct
- **Tags**: `llama-factory`, `lora`, `generated_from_trainer`
## How to Use
You can load and use the model with the `transformers` library; the `peft` package is also required to attach the LoRA adapter:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Define model and tokenizer
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
finetuned_model_id = "ELhadratiOth/darija-english-translater"
device = "cuda" # Change to "cpu" if GPU is not available
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=None
)
# Load the fine-tuned adapter
model.load_adapter(finetuned_model_id)
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
def translate_darija(text):
    messages = [
        {"role": "system", "content": "You are a professional NLP data parser. Follow the provided task and output scheme for consistency."},
        {"role": "user", "content": f"## Task:\n{text}\n\n## English Translation:"}
    ]
    text_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text_input], return_tensors="pt").to(device)
    # Greedy decoding; temperature has no effect when do_sample=False
    generated_ids = model.generate(**model_inputs, max_new_tokens=1024, do_sample=False)
    # Strip the prompt tokens so only the generated translation is returned
    generated_ids = generated_ids[:, model_inputs.input_ids.shape[1]:]
    translation = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return translation
# Example usage
query = "Your Darija text here"
response = translate_darija(query)
print(response)
```
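Alternatively, since this repository ships a PEFT LoRA adapter, you can attach it with the `peft` library directly. This is a minimal sketch, assuming the repository contains a standard PEFT adapter; the `merge_and_unload()` step is optional and simply folds the LoRA weights into the base model for faster inference:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id = "ELhadratiOth/darija-english-translater"

# Load the base model, then wrap it with the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Optional: merge the adapter weights into the base model
model = model.merge_and_unload()
```

After loading this way, the `translate_darija` helper above can be used unchanged.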
## Training Details
### Hyperparameters
- **Learning Rate**: 0.0001
- **Batch Size**:
  - Train: 1
  - Eval: 1
- **Seed**: 42
- **Distributed Training**: Multi-GPU
- **Number of Devices**: 2
- **Gradient Accumulation Steps**: 4
- **Total Train Batch Size**: 8
- **Total Eval Batch Size**: 2
- **Optimizer**: AdamW (betas=(0.9,0.999), epsilon=1e-08)
- **LR Scheduler**: Cosine
- **Warmup Ratio**: 0.1
- **Epochs**: 10
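
For reference, the hyperparameters above roughly correspond to the following `transformers` `TrainingArguments`. This is an illustrative sketch only: training was actually run through LLaMA-Factory, and the `output_dir` and `optim` names here are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="darija-english-translator",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # 1 per device x 2 GPUs x 4 steps = total train batch of 8
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",  # AdamW with default betas=(0.9, 0.999), eps=1e-8
)
```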
### Framework Versions
- PEFT: 0.12.0
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu121
- Datasets: 3.2.0
- Tokenizers: 0.21.0