MaxLSB committed on
Commit 3348f85 · verified · 1 Parent(s): 8e5210a

Update README.md

Files changed (1): README.md (+16 −16)
README.md CHANGED
@@ -38,25 +38,25 @@ We used LightEval for evaluation, with custom tasks for the French benchmarks. T
 
 ### French Benchmark Scores
 
-| **Benchmark** | **LFM2-700M** | **Luth-LFM2-700M** |
-|-------------------|------------------|-----------------------|
-| **IFEval-fr (strict prompt)** | 41.04 | <u>51.76</u> |
-| **GPQA-fr** | <u>28.07</u> | 28.04 |
-| **MMLU-fr** | 43.73 | <u>44.72</u> |
-| **MATH-500-fr** | 33.60 | <u>36.60</u> |
-| **Arc-Chall-fr** | 36.27 | <u>36.70</u> |
-| **Hellaswag-fr** | 41.51 | <u>48.25</u> |
+| Model | IFEval<br>French | GPQA-Diamond<br>French | MMLU<br>French | Math500<br>French | Arc-Challenge<br>French | Hellaswag<br>French |
+| --------------------- | ------------- | ------------------- | ----------- | -------------- | -------------------- | ---------------- |
+| **Luth-LFM2-700M** | <u>50.22</u> | <u>27.92</u> | <u>44.72</u> | <u>38.40</u> | <u>36.70</u> | 48.25 |
+| LFM2-700M | 41.96 | 20.81 | 43.70 | 32.40 | 36.27 | 41.51 |
+| Llama-3.2-1B | 27.79 | 25.38 | 25.49 | 15.80 | 29.34 | 25.09 |
+| Qwen3-0.6B | 44.86 | 26.90 | 27.13 | 29.20 | 31.57 | 25.10 |
+| Qwen2.5-0.5B-Instruct | 22.00 | 25.89 | 35.04 | 12.00 | 28.23 | <u>51.45</u> |
+
 
 ### English Benchmark Scores
 
-| **Benchmark** | **LFM2-700M** | **Luth-LFM2-700M** |
-|-------------------|------------------|-----------------------|
-| **IFEval-en (strict prompt)** | 64.14 | <u>64.70</u> |
-| **GPQA-en** | <u>28.20</u> | 25.88 |
-| **MMLU-en** | 50.73 | <u>50.92</u> |
-| **MATH-500-en** | 33.60 | <u>37.80</u> |
-| **Arc-Chall-en** | 38.48 | <u>38.91</u> |
-| **Hellaswag-en** | 52.67 | <u>54.08</u> |
+| Model | IFEval<br>English | GPQA-Diamond<br>English | MMLU<br>English | Math500<br>English | Arc-Challenge<br>English | Hellaswag<br>English |
+| --------------------- | -------------- | -------------------- | ------------ | --------------- | --------------------- | ----------------- |
+| **Luth-LFM2-700M** | 63.40 | 29.29 | 50.39 | 38.40 | <u>38.91</u> | 54.05 |
+| LFM2-700M | <u>65.06</u> | <u>30.81</u> | <u>50.65</u> | 32.00 | 38.65 | 52.54 |
+| Llama-3.2-1B | 44.05 | 25.25 | 31.02 | 26.40 | 34.30 | <u>55.84</u> |
+| Qwen3-0.6B | 57.18 | 29.29 | 36.79 | <u>43.40</u> | 33.70 | 42.92 |
+| Qwen2.5-0.5B-Instruct | 29.70 | 29.29 | 43.80 | 32.00 | 32.17 | 49.56 |
+
 
 ## Code Example
 