|
---
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
language:
- ku
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
datasets:
- nazimali/kurdish-wikipedia-articles
library_name: transformers
---
|
|
|
Continued pre-training of `mistralai/Mistral-Nemo-Instruct-2407` on the Kurdish Wikipedia dataset, using `unsloth`.
|
This model should be fine-tuned further before use, since the continued pre-training was only meant to improve Kurdish language understanding.
|
It's quantized with `bitsandbytes` to reduce memory use. See the [bitsandbytes documentation](https://huggingface.co/docs/transformers/main/en/quantization/bitsandbytes#bitsandbytes).
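
For example, loading in 4-bit with a `BitsAndBytesConfig` could look like the sketch below; the exact quantization settings used for this upload are an assumption, not taken from the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "nazimali/Mistral-Nemo-Kurdish"  # assumed repo id for this card

# Assumed 4-bit settings; adjust to match your memory budget.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```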
|
|
|
There isn't a standard, or even a good, Kurdish benchmark to evaluate the model against (that I could find).
|
Creating an evaluation will be my next project, so that there's a reproducible baseline for Kurdish.
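
Until then, held-out perplexity on Kurdish text is one rough stand-in. A minimal sketch, assuming the repo id and using a placeholder for the evaluation text:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nazimali/Mistral-Nemo-Kurdish"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

text = "..."  # held-out Kurdish text goes here
enc = tokenizer(text, return_tensors="pt").to(model.device)

# Causal-LM loss over the text, exponentiated to get perplexity.
with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss
print("perplexity:", torch.exp(loss).item())
```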
|
|
|
|
|
I'll also look into a multi-GPU training setup so I don't have to wait all day for results, and I'd like to train on both Kurmanji and Sorani.
|
|
|
|
|
### Use |
|
|
|
The model should be fine-tuned further for a specific task. See the instruction fine-tuned model [nazimali/Mistral-Nemo-Kurdish-Instruct](https://huggingface.co/nazimali/Mistral-Nemo-Kurdish-Instruct).
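
For a quick smoke test of the base model's Kurdish, something like the following sketch should work; the repo id is assumed, and the prompt mirrors the pre-training template shown further down.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nazimali/Mistral-Nemo-Kurdish"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Prompt in the pre-training format (roughly "Wikipedia article / Title / Article").
prompt = "Gotara Wikipedia\n### Sernav: Kurdistan\n\n### Gotar:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```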
|
|
|
### Training |
|
|
|
- Transformers `4.44.2`
- 1× NVIDIA A100 80GB PCIe
- Duration: 6h 31m 4s
|
|
|
```json |
|
{ |
|
"total_flos": 4121524790259794000, |
|
"train/epoch": 1, |
|
"train/global_step": 1960, |
|
"train/grad_norm": 3.1958093643188477, |
|
"train/learning_rate": 0, |
|
"train/loss": 1.2108, |
|
"train_loss": 1.256846008738693, |
|
"train_runtime": 23227.1752, |
|
"train_samples_per_second": 2.7, |
|
"train_steps_per_second": 0.084 |
|
} |
|
``` |
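
The logged run is consistent with a setup along the lines below. This is a sketch, not the actual training script: the LoRA settings, sequence length, and peak learning rate are assumptions, while the batch settings are inferred from the numbers above (62,720 rows / 1,960 steps over 1 epoch gives an effective batch size of 32).

```python
from datasets import load_dataset
from unsloth import FastLanguageModel, UnslothTrainer, UnslothTrainingArguments

# Raw dataset; column filtering and prompt formatting are covered below.
dataset = load_dataset("nazimali/kurdish-wikipedia-articles", split="train")

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mistralai/Mistral-Nemo-Instruct-2407",
    max_seq_length=2048,  # assumption
    load_in_4bit=True,    # matches the bitsandbytes quantization above
)

# LoRA adapters for parameter-efficient continued pre-training (ranks assumed).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = UnslothTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=UnslothTrainingArguments(
        num_train_epochs=1,             # matches train/epoch above
        per_device_train_batch_size=8,  # assumption: 8 x 4 accumulation = 32,
        gradient_accumulation_steps=4,  # consistent with 62,720 rows / 1,960 steps
        learning_rate=5e-5,             # assumption; the log shows it decayed to 0
        output_dir="outputs",
    ),
)
trainer.train()
```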
|
|
|
#### Pre-training data: |
|
|
|
- `nazimali/kurdish-wikipedia-articles`
- Dataset number of rows: 63,076
- Columns used: `title`, `text`
- Each column must have at least 1 character (see the filtering sketch below)
- Number of rows used for training: 62,720
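
A minimal sketch of reproducing that filtering with `datasets`; only the dataset name, column names, and row counts come from this card.

```python
from datasets import load_dataset

dataset = load_dataset("nazimali/kurdish-wikipedia-articles", split="train")  # 63,076 rows

# Keep only the columns used for training.
dataset = dataset.select_columns(["title", "text"])

# Drop rows where either column is missing or empty (at least 1 character).
dataset = dataset.filter(lambda row: bool(row["title"]) and bool(row["text"]))
print(dataset.num_rows)  # 62,720 in the original run
```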
|
|
|
#### Training prompt format: |
|
|
|
```python
# Kurdish template; in English roughly: "Wikipedia article / ### Title: {} / ### Article: {}"
training_prompt = """Gotara Wikipedia
### Sernav: {}

### Gotar:
{}"""
```