---
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
datasets:
- kurakurai/luth-sft
language:
- fr
- en
base_model:
- LiquidAI/LFM2-700M
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- luth
---

![Luth x LFM2](media/logo_collab.png)

# Luth-LFM2-700M

**Luth-LFM2-700M** is a French fine-tuned version of [LFM2-700M](https://huggingface.co/LiquidAI/LFM2-700M), developed in collaboration with Liquid AI and trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. The model shows improved French capabilities in instruction following, math, and general knowledge, while its English capabilities remain stable. Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), along with the [blog post](https://huggingface.co/blog/MaxLSB/luth) that details our recipe.

![Luth-LFM2 graph](media/lfm2-luth.png)

## Model Details

The model was trained with full fine-tuning on the Luth-SFT dataset using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl), and the resulting checkpoint was then merged back with the base LFM2-700M. This process retained the model's English capabilities while improving its performance in French.
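The card does not pin down the exact merge procedure; as a rough illustration, below is a minimal sketch of one common approach, linear interpolation between the base and fine-tuned weights. The checkpoint path and the `alpha` ratio are hypothetical placeholders, not the values used for Luth-LFM2-700M.

```python
# Minimal sketch of merging a fine-tuned checkpoint back into its base model
# via linear weight interpolation. The actual merge method and ratio used for
# Luth-LFM2-700M are not specified here; `alpha` and the checkpoint path are
# hypothetical placeholders.
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-700M")
tuned = AutoModelForCausalLM.from_pretrained("path/to/luth-sft-checkpoint")

alpha = 0.5  # hypothetical: 0.0 = pure base model, 1.0 = pure fine-tune
tuned_state = tuned.state_dict()
merged_state = {
    # Interpolate floating-point weights; copy non-float buffers unchanged.
    name: (1.0 - alpha) * param + alpha * tuned_state[name]
    if param.is_floating_point()
    else param
    for name, param in base.state_dict().items()
}

base.load_state_dict(merged_state)
base.save_pretrained("Luth-LFM2-700M-merged")
```

Interpolating back toward the base weights is a standard way to recover some of the base model's original behavior (here, its English performance) at a small cost to the fine-tune's gains.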
## Benchmark Results

We used [LightEval](https://github.com/huggingface/lighteval) for evaluation, with custom tasks for the French benchmarks. All models were evaluated with `temperature=0`, i.e. greedy decoding.
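As a rough illustration, a run of this kind can be launched through the LightEval CLI along the following lines. The exact flags vary between LightEval versions, and the custom-task file and task string below are hypothetical placeholders rather than the real task definitions from the GitHub repository linked above.

```bash
# Hypothetical LightEval invocation (flags vary across versions; the
# custom-task file and task string are placeholders, not our actual setup).
lighteval accelerate \
    --model_args "pretrained=kurakurai/Luth-LFM2-700M" \
    --custom_tasks community_tasks/french_evals.py \
    --tasks "community|mmlu-fr|0|0" \
    --output_dir ./results
```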
### French Benchmark Scores

| Model | IFEval<br>French | GPQA-Diamond<br>French | MMLU<br>French | Math500<br>French | Arc-Challenge<br>French | Hellaswag<br>French |
| --------------------- | ------ | ------ | ------ | ------ | ------ | ------ |
| **Luth-LFM2-700M**    | 50.22  | 27.92  | 44.72  | 38.40  | 36.70  | 48.25  |
| LFM2-700M             | 41.96  | 20.81  | 43.70  | 32.40  | 36.27  | 41.51  |
| Llama-3.2-1B          | 27.79  | 25.38  | 25.49  | 15.80  | 29.34  | 25.09  |
| Qwen3-0.6B            | 44.86  | 26.90  | 27.13  | 29.20  | 31.57  | 25.10  |
| Qwen2.5-0.5B-Instruct | 22.00  | 25.89  | 35.04  | 12.00  | 28.23  | 51.45  |
### English Benchmark Scores

| Model | IFEval<br>English | GPQA-Diamond<br>English | MMLU<br>English | Math500<br>English | Arc-Challenge<br>English | Hellaswag<br>English |
| --------------------- | ------ | ------ | ------ | ------ | ------ | ------ |
| **Luth-LFM2-700M**    | 63.40  | 29.29  | 50.39  | 38.40  | 38.91  | 54.05  |
| LFM2-700M             | 65.06  | 30.81  | 50.65  | 32.00  | 38.65  | 52.54  |
| Llama-3.2-1B          | 44.05  | 25.25  | 31.02  | 26.40  | 34.30  | 55.84  |
| Qwen3-0.6B            | 57.18  | 29.29  | 36.79  | 43.40  | 33.70  | 42.92  |
| Qwen2.5-0.5B-Instruct | 29.70  | 29.29  | 43.80  | 32.00  | 32.17  | 49.56  |
## Code Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("kurakurai/Luth-LFM2-700M")
model = AutoModelForCausalLM.from_pretrained("kurakurai/Luth-LFM2-700M")

messages = [
    # "What is the capital of France?"
    {"role": "user", "content": "Quelle est la capitale de la France?"},
]

# Build the prompt with the model's chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1] :], skip_special_tokens=True
    )
)
```

## Citation

```bibtex
@misc{luth2025kurakurai,
  title        = {Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer},
  author       = {Lasbordes, Maxence and Gad, Sinoué},
  year         = {2025},
  howpublished = {\url{https://arxiv.org/abs/2510.05846}},
  note         = {arXiv:2510.05846}
}
```