---
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
datasets:
- kurakurai/luth-sft
language:
- fr
- en
base_model:
- LiquidAI/LFM2-700M
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- luth
---

# Luth-LFM2-700M
**Luth-LFM2-700M** is a French fine-tuned version of [LFM2-700M](https://huggingface.co/LiquidAI/LFM2-700M), developed in collaboration with Liquid AI and trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. Fine-tuning improved the model's French capabilities in instruction following, math, and general knowledge, while its English capabilities remained stable.
Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), and our [blog post](https://huggingface.co/blog/MaxLSB/luth) describes the recipe in more detail.

## Model Details
The model was trained with full fine-tuning on the Luth-SFT dataset using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The fine-tuned checkpoint was then merged with the base LFM2-700M. This process retained the model's English capabilities while improving its performance in French.
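
This card does not state which merge method or ratio was used, so the following is only a toy sketch of one common option, linear weight interpolation, written in plain Python. The function name `linear_merge`, the `alpha` parameter, and the example parameter name `layer.weight` are illustrative assumptions, not part of the Luth training code.

```python
def linear_merge(base, finetuned, alpha=0.5):
    """Linearly interpolate two checkpoints with identical parameter names.

    Hypothetical helper: each checkpoint is a dict mapping a parameter name
    to a flat list of floats. alpha=0 returns the base weights, alpha=1
    returns the fine-tuned weights.
    """
    if base.keys() != finetuned.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name, base_values in base.items():
        tuned_values = finetuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy checkpoints standing in for the base and fine-tuned models.
base = {"layer.weight": [0.0, 1.0, 2.0]}
finetuned = {"layer.weight": [1.0, 1.0, 4.0]}
merged = linear_merge(base, finetuned, alpha=0.5)
print(merged)
```

The intuition is that weights pulled part of the way back toward the base model keep more of its original behavior (here, English) while still carrying the fine-tuned gains (here, French). Real merges operate on full tensors and may use other schemes.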
## Benchmark Results
We used LightEval for evaluation, with custom tasks for the French benchmarks. All models were evaluated with `temperature=0`.
### French Benchmark Scores
| Model | IFEval French | GPQA-Diamond French | MMLU French | Math500 French | Arc-Challenge French | Hellaswag French |
| --------------------- | ------------- | ------------------- | ----------- | -------------- | -------------------- | ---------------- |
| **Luth-LFM2-700M** | 50.22 | 27.92 | 44.72 | 38.40 | 36.70 | 48.25 |
| LFM2-700M | 41.96 | 20.81 | 43.70 | 32.40 | 36.27 | 41.51 |
| Llama-3.2-1B | 27.79 | 25.38 | 25.49 | 15.80 | 29.34 | 25.09 |
| Qwen3-0.6B | 44.86 | 26.90 | 27.13 | 29.20 | 31.57 | 25.10 |
| Qwen2.5-0.5B-Instruct | 22.00 | 25.89 | 35.04 | 12.00 | 28.23 | 51.45 |
### English Benchmark Scores
| Model | IFEval English | GPQA-Diamond English | MMLU English | Math500 English | Arc-Challenge English | Hellaswag English |
| --------------------- | -------------- | -------------------- | ------------ | --------------- | --------------------- | ----------------- |
| **Luth-LFM2-700M** | 63.40 | 29.29 | 50.39 | 38.40 | 38.91 | 54.05 |
| LFM2-700M | 65.06 | 30.81 | 50.65 | 32.00 | 38.65 | 52.54 |
| Llama-3.2-1B | 44.05 | 25.25 | 31.02 | 26.40 | 34.30 | 55.84 |
| Qwen3-0.6B | 57.18 | 29.29 | 36.79 | 43.40 | 33.70 | 42.92 |
| Qwen2.5-0.5B-Instruct | 29.70 | 29.29 | 43.80 | 32.00 | 32.17 | 49.56 |
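
To make the comparison with the base model easier to read, the short script below computes the per-benchmark difference between Luth-LFM2-700M and LFM2-700M. The scores are copied from the two tables above; the names `BENCHMARKS`, `SCORES`, and `score_deltas` are illustrative and do not come from the Luth evaluation scripts.

```python
BENCHMARKS = [
    "IFEval", "GPQA-Diamond", "MMLU", "Math500", "Arc-Challenge", "Hellaswag",
]

# Scores copied from the French and English tables in this card.
SCORES = {
    "French": {
        "Luth-LFM2-700M": [50.22, 27.92, 44.72, 38.40, 36.70, 48.25],
        "LFM2-700M": [41.96, 20.81, 43.70, 32.40, 36.27, 41.51],
    },
    "English": {
        "Luth-LFM2-700M": [63.40, 29.29, 50.39, 38.40, 38.91, 54.05],
        "LFM2-700M": [65.06, 30.81, 50.65, 32.00, 38.65, 52.54],
    },
}


def score_deltas(language):
    """Return {benchmark: Luth score minus base score} for one language."""
    luth = SCORES[language]["Luth-LFM2-700M"]
    base = SCORES[language]["LFM2-700M"]
    return {
        name: round(l - b, 2)
        for name, l, b in zip(BENCHMARKS, luth, base)
    }


for language in SCORES:
    deltas = score_deltas(language)
    mean_delta = sum(deltas.values()) / len(deltas)
    print(language, deltas, f"mean: {mean_delta:+.2f}")
```

By this arithmetic, Luth-LFM2-700M is ahead of the base model on every French benchmark (about +4.9 points on average), and within roughly two points of it on each English benchmark except Math500, where it gains 6.40.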
## Code Example
```python
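from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("kurakurai/Luth-LFM2-700M")
model = AutoModelForCausalLM.from_pretrained("kurakurai/Luth-LFM2-700M")

# A single-turn conversation in chat format.
messages = [
    {"role": "user", "content": "Quelle est la capitale de la France?"},
]

# Apply the chat template and tokenize; add_generation_prompt=True appends
# the assistant turn marker so the model starts its reply.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1] :], skip_special_tokens=True
    )
)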
```
## Citation
```bibtex
@misc{luth2025kurakurai,
  title        = {Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer},
  author       = {Lasbordes, Maxence and Gad, Sinoué},
  year         = {2025},
  howpublished = {\url{https://arxiv.org/abs/2510.05846}},
  note         = {arXiv:2510.05846}
}
```