phi4-mini-welsh

An initial Welsh fine-tune of Phi-4-mini, trained on authentic Welsh datasets.

This is a standalone merged model: the Welsh LoRA fine-tuning is already merged into the base weights, so it can be loaded and used directly without a separate adapter.

Model Details

  • Base Model: microsoft/Phi-4-mini-instruct
  • Training Method: LoRA (Low-Rank Adaptation) using Unsloth
  • Language: Welsh (Cymraeg) with English support
  • Model Type: Merged Causal Language Model
  • Training Date: 2025-08-18 09:10:39 UTC
  • Welsh Tokens: Extended tokenizer with Welsh-specific tokens

Training Configuration

  • LoRA Rank: 64
  • LoRA Alpha: 32
  • Learning Rate: 0.0001
  • Batch Size: 2
  • Epochs: 3
  • Max Sequence Length: 2048
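
The rank and alpha values above determine how strongly the adapter update is scaled and how many extra weights are trained. The sketch below works through that arithmetic under the standard LoRA formulation, where the low-rank update is scaled by alpha / rank. The 3072 x 3072 layer shape is a hypothetical example chosen only to make the numbers concrete; the model card does not state which modules were adapted or their shapes.

```python
# Illustrative arithmetic only; the layer shape below is a hypothetical example.
LORA_RANK = 64    # from the Training Configuration list
LORA_ALPHA = 32   # from the Training Configuration list


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical square projection matrix (an assumption, not a value from the model card).
D_IN, D_OUT = 3072, 3072

scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
adapter_params = lora_param_count(D_IN, D_OUT, LORA_RANK)
full_params = D_IN * D_OUT

print(f"scaling factor: {scaling}")
print(f"adapter weights: {adapter_params:,} vs full matrix: {full_params:,}")
print(f"fraction trained: {adapter_params / full_params:.2%}")
```

Because alpha is half the rank here, the scaling factor is 0.5, so the adapter's contribution is halved relative to a configuration with alpha equal to rank.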

Training Data

This model was trained on the following Welsh language datasets:

  • Banc Trawsgrifiadau Bangor (techiaith/banc-trawsgrifiadau-bangor)
  • Common Voice Welsh 22.0 (techiaith/commonvoice_22_0_cy)
  • Welsh Legislation (techiaith/legislation-gov-uk_en-cy)
  • Welsh Wikipedia (1000 articles)

Total estimated tokens: 500K-800K authentic Welsh tokens
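
For a sense of scale, the token estimate can be combined with the training configuration (3 epochs, batch size 2, maximum sequence length 2048). The sketch below gives a rough lower bound on the number of batches processed. It assumes every sequence is filled to the full 2048 tokens, which the model card does not state; with shorter sequences the real number of batches would be higher.

```python
import math

# Values taken from the model card.
TOKENS_LOW, TOKENS_HIGH = 500_000, 800_000
EPOCHS = 3
BATCH_SIZE = 2
MAX_SEQ_LEN = 2048


def min_batches(total_tokens: int, epochs: int, batch_size: int, seq_len: int) -> int:
    """Fewest batches needed if every sequence were filled to seq_len tokens (an assumption)."""
    tokens_per_batch = batch_size * seq_len
    return math.ceil(total_tokens * epochs / tokens_per_batch)


low = min_batches(TOKENS_LOW, EPOCHS, BATCH_SIZE, MAX_SEQ_LEN)
high = min_batches(TOKENS_HIGH, EPOCHS, BATCH_SIZE, MAX_SEQ_LEN)

print(f"tokens seen over training: {TOKENS_LOW * EPOCHS:,} to {TOKENS_HIGH * EPOCHS:,}")
print(f"minimum batches: {low} to {high}")
```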

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model (standalone with Welsh tokens)
model = AutoModelForCausalLM.from_pretrained("DewiBrynJones/phi4-mini-welsh")
tokenizer = AutoTokenizer.from_pretrained("DewiBrynJones/phi4-mini-welsh")

# Generate Welsh text
prompt = "<welsh>Bore da, sut mae"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens limits only the continuation, so the prompt length does not count against it
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Model Capabilities

✅ What the model can do:

  • Generate natural Welsh text
  • Understand Welsh sentence structure and mutations
  • Handle Welsh-English code-switching
  • Recognize Welsh idioms and expressions
  • Process both formal and colloquial Welsh

⚠️ Limitations:

  • Early-stage model; may make grammatical errors
  • Knowledge is limited to what the training data covers
  • May occasionally mix Welsh and English inappropriately

Training Infrastructure

  • Framework: Unsloth + PyTorch
  • Hardware: NVIDIA GPU with CUDA support
  • Optimization: 4-bit quantization for efficiency
  • Memory: Gradient checkpointing enabled
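
To illustrate why 4-bit quantization helps when training on a single GPU, the following back-of-the-envelope sketch compares weight storage at different precisions. It assumes roughly 4 billion parameters (rounded from the size shown on the model page) and counts weights only; activations, optimizer state, adapter weights, and quantization metadata are ignored, so real memory use will be higher.

```python
# Rough weight-memory estimate; ignores activations, optimizer state and quantization overhead.
PARAMS = 4_000_000_000  # approximate parameter count (assumption, rounded from "4B params")

BITS_PER_WEIGHT = {"float32": 32, "float16": 16, "4-bit": 4}


def weight_gigabytes(params: int, bits: int) -> float:
    """Storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9


estimates = {name: weight_gigabytes(PARAMS, bits) for name, bits in BITS_PER_WEIGHT.items()}

for name, gb in estimates.items():
    print(f"{name:>8}: {gb:.1f} GB")
```

Under these assumptions, holding the base weights in 4-bit takes about a quarter of the space needed for 16-bit weights, which leaves more room for activations and the adapter's optimizer state.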

Ethics and Bias

This model has been trained on publicly available Welsh language datasets. It may reflect biases present in the training data. Users should be aware of potential limitations when using the model for sensitive applications.

Acknowledgments

License

This model is released under the MIT License. Please respect the original licenses of the training datasets.

Citation

If you use this model, please consider citing:

@misc{DewiBrynJones_phi4-mini-welsh,
  title={phi4-mini-welsh: Welsh Fine-tuned Phi-4-mini},
  author={DewiBrynJones},
  year={2025},
  url={https://huggingface.co/DewiBrynJones/phi4-mini-welsh}
}

Model size: 4B params (Safetensors, tensor type F16)