🏴󠁧󠁢󠁷󠁬󠁳󠁿 phi4-mini-welsh
Initial Welsh fine-tuned Phi-4-mini model, trained on authentic Welsh datasets.
This is a standalone merged model: the LoRA adapter has been merged into the base weights, so the Welsh language improvements are built in and the model can be used directly, with no separate adapter to load.
Model Details
- Base Model: microsoft/Phi-4-mini-instruct
- Training Method: LoRA (Low-Rank Adaptation) using Unsloth
- Language: Welsh (Cymraeg) with English support
- Model Type: Merged Causal Language Model
- Training Date: 2025-08-18 09:10:39 UTC
- Welsh Tokens: Extended tokenizer with Welsh-specific tokens (such as the `<welsh>` marker used in the Usage example; see the sketch after this list)
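The exact list of added tokens is not published in this card, so the snippet below is only a hypothetical sketch of how a tokenizer is typically extended; `<welsh>` mirrors the marker used in the Usage example, but the procedure and any other token strings are assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical sketch: the actual Welsh token list for this model is not published.
base = "microsoft/Phi-4-mini-instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# "<welsh>" mirrors the marker in the Usage example below.
num_added = tokenizer.add_special_tokens({"additional_special_tokens": ["<welsh>"]})

# New token IDs need embedding rows, so resize the embedding matrix to match.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```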
Training Configuration
- LoRA Rank: 64
- LoRA Alpha: 32
- Learning Rate: 0.0001
- Batch Size: 2
- Epochs: 3
- Max Sequence Length: 2048
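As a rough reconstruction, these values map onto Unsloth's `FastLanguageModel` API as in the sketch below; the `target_modules` choice and other unlisted arguments are assumptions, not the actual training script.

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="microsoft/Phi-4-mini-instruct",
    max_seq_length=2048,            # Max Sequence Length
    load_in_4bit=True,              # 4-bit quantization (see Training Infrastructure)
)

model = FastLanguageModel.get_peft_model(
    model,
    r=64,                           # LoRA Rank
    lora_alpha=32,                  # LoRA Alpha
    # Assumed target modules -- a typical choice, not documented in this card.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",   # see Training Infrastructure
)

training_args = TrainingArguments(
    learning_rate=1e-4,             # Learning Rate
    per_device_train_batch_size=2,  # Batch Size
    num_train_epochs=3,             # Epochs
    output_dir="outputs",
)
```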
Training Data
This model was trained on the following Welsh language datasets:
- Banc Trawsgrifiadau Bangor (techiaith/banc-trawsgrifiadau-bangor)
- Common Voice Welsh 22.0 (techiaith/commonvoice_22_0_cy)
- Welsh Legislation (techiaith/legislation-gov-uk_en-cy)
- Welsh Wikipedia (1000 articles)
Total estimated tokens: 500K-800K authentic Welsh tokens
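The Techiaith corpora above are hosted on the Hugging Face Hub, so they can be pulled with the `datasets` library; the splits, configs, and preprocessing actually used for this model are not documented here, making this only a generic loading sketch.

```python
from datasets import load_dataset

# Dataset IDs from the list above; split and config choices are assumptions.
banc = load_dataset("techiaith/banc-trawsgrifiadau-bangor", split="train")
legislation = load_dataset("techiaith/legislation-gov-uk_en-cy", split="train")
```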
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model (standalone, with the Welsh tokens already merged in)
model = AutoModelForCausalLM.from_pretrained("DewiBrynJones/phi4-mini-welsh")
tokenizer = AutoTokenizer.from_pretrained("DewiBrynJones/phi4-mini-welsh")

# Generate Welsh text
prompt = "<welsh>Bore da, sut mae"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens counts only generated tokens; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
Model Capabilities
✅ What the model can do:
- Generate natural Welsh text
- Understand Welsh sentence structure and mutations
- Handle Welsh-English code-switching
- Recognize Welsh idioms and expressions
- Process both formal and colloquial Welsh
⚠️ Limitations:
- May still make grammatical errors
- Knowledge is limited to what appeared in the training data
- May occasionally mix languages inappropriately
Training Infrastructure
- Framework: Unsloth + PyTorch
- Hardware: NVIDIA GPU with CUDA support
- Optimization: 4-bit quantization for efficiency
- Memory: Gradient checkpointing enabled
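The 4-bit setting above describes training; for lower-memory inference with the merged model, a similar optional configuration via bitsandbytes would look like this sketch (not required for normal use).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Optional: mirror the training-time 4-bit quantization at inference.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "DewiBrynJones/phi4-mini-welsh",
    quantization_config=quant_config,
    device_map="auto",   # requires the accelerate package
)
```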
Ethics and Bias
This model has been trained on publicly available Welsh language datasets. It may reflect biases present in the training data. Users should be aware of potential limitations when using the model for sensitive applications.
Acknowledgments
- Techiaith (Bangor University) for Welsh language datasets
- Unsloth for efficient training framework
- Microsoft for the base Phi-4-mini model
- Welsh language community for data contributions
License
This model is released under the MIT License. Please respect the original licenses of the training datasets.
Citation
If you use this model, please consider citing:
```bibtex
@misc{DewiBrynJones_phi4-mini-welsh,
  title={phi4-mini-welsh: Welsh Fine-tuned Phi-4-mini},
  author={DewiBrynJones},
  year={2025},
  url={https://huggingface.co/DewiBrynJones/phi4-mini-welsh}
}
```