---
datasets:
  - atlasia/darija_english
---

# Darija-English Translator

This repository provides a LoRA adapter for Qwen/Qwen2.5-1.5B-Instruct, fine-tuned on the darija_finetune_train dataset. It is designed to translate text from Moroccan Darija (a dialect of Arabic) to English.

## Model Details

- Library: PEFT (the checkpoint is a LoRA adapter; see the loading sketch after this list)
- License: Apache 2.0
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Tags: llama-factory, lora, generated_from_trainer
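
Because the checkpoint is a PEFT LoRA adapter rather than a full model, it can also be loaded in a single call with peft's AutoPeftModelForCausalLM, which reads the base model id from the adapter's config. This is a minimal sketch; the section below shows the load_adapter route used by this card.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Sketch: peft resolves the base model from the adapter config stored in the repo,
# loads it, and attaches the LoRA weights in one call.
model = AutoPeftModelForCausalLM.from_pretrained(
    "ELhadratiOth/darija-english-translater",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
```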

## How to Use

You can load and use the model with the transformers library; the peft package must also be installed, since the fine-tuned weights are a LoRA adapter:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Define model and tokenizer identifiers
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
finetuned_model_id = "ELhadratiOth/darija-english-translater"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype="auto",
)

# Load the fine-tuned LoRA adapter (requires the peft package)
model.load_adapter(finetuned_model_id)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

def translate_darija(text):
    messages = [
        {"role": "system", "content": "You are a professional NLP data parser. Follow the provided task and output scheme for consistency."},
        {"role": "user", "content": f"## Task:\n{text}\n\n## English Translation:"},
    ]

    # Build the chat prompt and tokenize it
    text_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text_input], return_tensors="pt").to(device)

    # Greedy decoding; with do_sample=False no sampling temperature is used
    generated_ids = model.generate(**model_inputs, max_new_tokens=1024, do_sample=False)

    # Keep only the newly generated tokens (drop the prompt) before decoding
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    translation = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return translation

# Example usage
query = "Your Darija text here"
response = translate_darija(query)
print(response)
```
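
If you want to serve the model without a peft dependency at inference time, the LoRA weights can be merged into the base model and saved as a standalone checkpoint. This is a hedged sketch rather than part of the original card; the output directory name is illustrative.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, attach the adapter, and fold the LoRA weights into the base.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct", torch_dtype="auto")
merged = PeftModel.from_pretrained(base, "ELhadratiOth/darija-english-translater").merge_and_unload()

# Save the merged model and tokenizer to an illustrative local directory.
merged.save_pretrained("darija-english-translater-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct").save_pretrained("darija-english-translater-merged")
```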

## Training Details

### Hyperparameters

The settings below correspond to the transformers TrainingArguments sketch shown after this list.

- Learning Rate: 0.0001
- Batch Size (per device):
  - Train: 1
  - Eval: 1
- Seed: 42
- Distributed Training: multi-GPU
- Number of Devices: 2
- Gradient Accumulation Steps: 4
- Total Train Batch Size: 8
- Total Eval Batch Size: 2
- Optimizer: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- LR Scheduler: cosine
- Warmup Ratio: 0.1
- Epochs: 10
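
For reference, the hyperparameters above map roughly onto the following transformers TrainingArguments. This is an illustrative sketch only: the run itself was configured through LLaMA-Factory, and the output directory is a placeholder. The effective train batch size works out as 1 per device × 2 GPUs × 4 gradient-accumulation steps = 8.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the hyperparameters listed above (illustrative only).
training_args = TrainingArguments(
    output_dir="darija-english-translater-lora",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # 1 per device x 2 GPUs x 4 steps = 8 effective
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```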

### Framework Versions

- PEFT: 0.12.0
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu121
- Datasets: 3.2.0
- Tokenizers: 0.21.0