---
datasets:
- atlasia/darija_english
---

# Darija-English Translator

This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the `darija_finetune_train` dataset. It translates text from Moroccan Darija (a dialect of Arabic) into English.

## Model Details

- **Library**: PEFT
- **License**: Apache 2.0
- **Base Model**: Qwen/Qwen2.5-1.5B-Instruct
- **Tags**: `llama-factory`, `lora`, `generated_from_trainer`

## How to Use

Load the base model with the `transformers` library and attach the fine-tuned LoRA adapter (the `peft` package must be installed for `load_adapter` to work):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define model and tokenizer identifiers
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
finetuned_model_id = "ELhadratiOth/darija-english-translater"
device = "cuda"  # Change to "cpu" if a GPU is not available

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype="auto",
)

# Load the fine-tuned adapter
model.load_adapter(finetuned_model_id)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

def translate_darija(text):
    messages = [
        {"role": "system", "content": "You are a professional NLP data parser. Follow the provided task and output scheme for consistency."},
        {"role": "user", "content": f"## Task:\n{text}\n\n## English Translation:"},
    ]
    text_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text_input], return_tensors="pt").to(device)

    # Greedy decoding (sampling parameters such as temperature have no effect when do_sample=False)
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=1024,
        do_sample=False,
    )

    # Strip the prompt tokens so only the generated translation is decoded
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    translation = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return translation

# Example usage
query = "Your Darija text here"
response = translate_darija(query)
print(response)
```

## Training Details

### Hyperparameters

- **Learning Rate**: 0.0001
- **Batch Size (per device)**:
  - Train: 1
  - Eval: 1
- **Seed**: 42
- **Distributed Training**: Multi-GPU
- **Number of Devices**: 2
- **Gradient Accumulation Steps**: 4
- **Total Train Batch Size**: 8
- **Total Eval Batch Size**: 2
- **Optimizer**: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- **LR Scheduler**: Cosine
- **Warmup Ratio**: 0.1
- **Epochs**: 10

A sketch that mirrors these values in a plain PEFT/Transformers setup is included at the end of this card.

### Framework Versions

- PEFT: 0.12.0
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu121
- Datasets: 3.2.0
- Tokenizers: 0.21.0
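
### Reproducing the Setup (sketch)

The run above was produced with LLaMA-Factory. As a rough guide only, the sketch below mirrors the hyperparameters listed on this card using plain PEFT + Transformers instead. The LoRA rank, alpha, dropout, target modules, and output directory are assumptions (they are not reported here), and the dataset tokenization/formatting step is omitted.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"

model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# LoRA adapter configuration (rank/alpha/dropout/targets are illustrative, not the released values)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters taken from the card above
training_args = TrainingArguments(
    output_dir="darija-english-translator-lora",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# A transformers.Trainer (or trl.SFTTrainer) would then be built from this model,
# these arguments, and a tokenized/formatted `darija_finetune_train` split.
```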