---
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
datasets:
- kurakurai/luth-sft
language:
- fr
- en
base_model:
- LiquidAI/LFM2-700M
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- luth
---

# Luth-LFM2-700M
**Luth-LFM2-700M** is a French fine-tuned version of [LFM2-700M](https://huggingface.co/LiquidAI/LFM2-700M), trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. Fine-tuning improves the model's French capabilities in instruction following, math, and general knowledge, while its English capabilities remain stable.

Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), and the accompanying [blog post](https://huggingface.co/blog/MaxLSB/luth) details our recipe further.
## Model Details
The model was trained with full fine-tuning on the Luth-SFT dataset using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl), and the resulting checkpoint was then merged back with the base LFM2-700M, as sketched below. This merge retained the model's English capabilities while improving its performance in French.
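For illustration, here is a minimal sketch of a linear weight merge between a fine-tuned checkpoint and the base model. The merge ratio `alpha`, the merge method, and the checkpoint path are assumptions for illustration, not the exact recipe used for Luth:

```python
# Hypothetical linear merge sketch; `alpha` and the checkpoint path are
# illustrative assumptions, not the actual Luth merge recipe.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-700M")
tuned = AutoModelForCausalLM.from_pretrained("path/to/luth-sft-checkpoint")  # hypothetical

alpha = 0.5  # fraction of the fine-tuned weights to keep (assumption)
tuned_state = tuned.state_dict()
with torch.no_grad():
    merged_state = {
        name: alpha * tuned_state[name] + (1.0 - alpha) * param
        for name, param in base.state_dict().items()
    }
base.load_state_dict(merged_state)
base.save_pretrained("Luth-LFM2-700M-merged")
```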
## Benchmark Results
We used [LightEval](https://github.com/huggingface/lighteval) for evaluation, with custom tasks for the French benchmarks. All models were evaluated with `temperature=0` (greedy decoding). In the tables below, the underlined value marks the better score for each benchmark.
### French Benchmark Scores
| Benchmark         | LFM2-700M        | Luth-LFM2-700M     |
|-------------------|------------------|-----------------------|
| ifeval-fr         | 42.33            | <u>50.65</u>                 |
| gpqa-fr           | 26.55            | <u>28.04</u>                 |
| mmlu-fr           | 43.69            | <u>44.71</u>                 |
| math-500-fr       | 32.40            | <u>36.60</u>                 |
| kholle            | 39.43            | <u>43.33</u>                 |
| arc-chall-fr      | 36.18            | <u>36.70</u>                 |
| hellaswag-fr      | 41.51            | <u>48.25</u>          |
### English Benchmark Scores
| Benchmark         | LFM2-700M        | Luth-LFM2-700M     |
|-------------------|------------------|-----------------------|
| ifeval-en         | <u>65.80</u>     | 62.48                 |
| gpqa-en           | <u>26.98</u>     | 23.17                 |
| mmlu-en           | <u>50.74</u>     | 50.45                 |
| math-500-en       | 34.00            | <u>40.40</u>          |
| arc-chall-en      | 38.57            | <u>39.25</u>          |
| hellaswag-en      | 52.63            | <u>54.07</u>          |
## Code Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kurakurai/Luth-LFM2-700M")
model = AutoModelForCausalLM.from_pretrained("kurakurai/Luth-LFM2-700M")

# Build the prompt with the model's chat template
messages = [
    {"role": "user", "content": "Quelle est la capitale de la France?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate, then decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=100)
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1] :], skip_special_tokens=True
    )
)
```
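Note that the slice `outputs[0][inputs["input_ids"].shape[-1]:]` drops the prompt tokens, so only the model's answer is decoded.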
## Citation
```bibtex
@misc{luthlfm2kurakurai,
  title   = {Luth-LFM2-700M},
  author  = {Kurakura AI Team},
  year    = {2025},
  howpublished = {\url{https://huggingface.co/kurakurai/Luth-LFM2-700M}},
  note    = {LFM2-700M fine-tuned on French datasets}
}
```
