emre committed
Commit bc27a98 · verified · 1 Parent(s): 1dd879b

Update README.md

Files changed (1): README.md (+34 -1)
README.md CHANGED
@@ -6,9 +6,10 @@ tags:
 - unsloth
 - gemma3
 - trl
-license: apache-2.0
+license: afl-3.0
 language:
 - en
+- tr
 ---
 
 # Uploaded model
@@ -20,3 +21,35 @@ language:
 This gemma3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+
+
+## Preliminary Evaluation Results / Leaderboard (Unofficial)
+
+**English version is given below.**
+
+Below are preliminary results for several models evaluated on the TARA v1 dataset. The results were computed on the `success_rate (%)` metric using the stated evaluator model (`gemini-2-flash`). This table is not an official leaderboard; it is intended to show the models' relative performance across different reasoning domains.
+
+* **Evaluator model:** `gemini-2-flash`
+* **Metric:** `success_rate (%)`
+
+| Model | Scientific (RAG) (%) | Ethics (%) | Scenario (%) | Creative (%) | Logical (%) | Math (%) | Planning (%) | Python (%) | SQL (%) | Historical (RAG) (%) | Overall Success (%) |
+| :------------------------------------------------------------------------------- | :----------------: | :------: | :---------: | :----------: | :-----------: | :-----------: | :----------: | :--------: | :-----: | :----------------: | :--------------: |
+| [emre/gemma-3-4b-it-tr-reasoning40k](https://huggingface.co/emre/gemma-3-4b-it-tr-reasoning40k) | 73.64 | 62.73 | 60.91 | 48.18 | 60.00 | 38.18 | 51.82 | 35.45 | 41.82 | 75.45 | **54.82** |
+| [unsloth/gemma-3-4b-it](https://huggingface.co/unsloth/gemma-3-4b-it) | 62.73 | 74.55 | 88.18 | 58.18 | 71.82 | 59.09 | 41.82 | 70.91 | 41.82 | 95.45 | **66.45** |
+| [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) | 63.64 | 46.36 | 47.27 | 40.00 | 54.55 | 27.27 | 17.27 | 33.64 | 30.00 | 53.64 | **41.36** |
+| [emre/gemma-7b-it-Turkish-Reasoning-FT-smol](https://huggingface.co/emre/gemma-7b-it-Turkish-Reasoning-FT-smol) | 52.73 | 42.73 | 45.45 | 21.82 | 39.09 | 33.64 | 28.18 | 30.00 | 30.00 | 60.91 | **38.45** |
+| [emre/gemma-3-12b-it-tr-reasoning40k](https://huggingface.co/emre/gemma-3-12b-it-tr-reasoning40k) | 92.73 | 70.91 | 86.36 | 62.73 | 71.82 | 83.64 | 60.00 | 92.73 | 55.45 | 79.09 | **75.55** |
+| [unsloth/gemma-3-12b-it-tr](https://huggingface.co/unsloth/gemma-3-12b-it) | 85.45 | 93.64 | 93.64 | 68.18 | 77.27 | 62.73 | 53.64 | 86.36 | 61.82 | 95.45 | **77.82** |
+| [emre/gemma-3-12b-ft-tr-reasoning40k](https://huggingface.co/emre/gemma-3-12b-ft-tr-reasoning40k) | 86.36 | 68.18 | 77.27 | 54.55 | 47.27 | 50.91 | 43.64 | 59.09 | 23.64 | 85.55 | **59.55** |
+| [emre/gemma-3-27b-it-tr-reasoning40k-4bit](https://huggingface.co/emre/gemma-3-27b-it-tr-reasoning40k-4bit) | 93.64 | 95.45 | 97.27 | 65.45 | 77.27 | 82.73 | 71.82 | 92.73 | 75.45 | 95.45 | **84.73** |
+| [unsloth/gemma-3-27b-it-unsloth-bnb-4bit](https://huggingface.co/unsloth/gemma-3-27b-it-unsloth-bnb-4bit) | 86.36 | 71.82 | 96.36 | 59.09 | 81.82 | 76.36 | 66.36 | 93.64 | 69.09 | 99.09 | **80.00** |
+| [TURKCELL/Turkcell-LLM-7b-v1](https://huggingface.co/TURKCELL/Turkcell-LLM-7b-v1) | 50.91 | 49.09 | 31.82 | 12.73 | 43.73 | 14.55 | 15.45 | 20.00 | 0.91 | 75.45 | **31.36** |
+| [google/gemini-1.5-flash](https://ai.google.dev/gemini-api/docs/models?hl=en#model-versions) | 100.00 | 90.91 | 100.00 | 77.27 | 100.00 | 63.64 | 71.82 | 92.73 | 85.45 | 100.00 | **88.18** |
+| [google/gemini-2.0-flash-lite](https://ai.google.dev/gemini-api/docs/models?hl=en#model-versions) | 95.45 | 100.00 | 100.00 | 79.09 | 100.00 | 85.45 | 80.91 | 92.73 | 90.91 | 97.27 | **92.18** |
+| [Trendyol/Trendyol-LLM-7B-chat-v4.1.0](https://huggingface.co/Trendyol/Trendyol-LLM-7B-chat-v4.1.0) | 84.55 | 71.82 | 68.18 | 54.55 | 70.91 | 60.00 | 46.36 | 80.00 | 46.36 | 81.82 | **66.46** |
+| [Openai/gpt-4o-mini-2024-07-18](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/) | 93.64 | 87.27 | 100.00 | 75.45 | 82.73 | 75.45 | 71.82 | 92.73 | 76.36 | 100.00 | **85.55** |
+| [Openai/o3-mini-2025-01-31](https://openai.com/index/openai-o3-mini/) | 100.00 | 93.64 | 100.00 | 92.73 | 100.00 | 100.00 | 85.45 | 88.18 | 100.00 | 100.00 | **96.00** |
+| [neuralwork/gemma-2-9b-it-tr](https://huggingface.co/neuralwork/gemma-2-9b-it-tr) | 94.55 | 81.82 | 91.82 | 91.82 | 79.09 | 58.18 | 46.36 | 61.82 | 49.09 | 96.36 | **75.09** |
+| [Openai/gpt-4.1-nano-2025-04-14](https://openai.com/index/gpt-4-1/) | 100.00 | 95.45 | 82.73 | 91.82 | 82.73 | 69.09 | 71.82 | 86.36 | 75.45 | 100.00 | **85.55** |
+| [Openai/gpt-4o-2024-08-06](https://openai.com/index/gpt-4o-system-card/) | 89.09 | 80.91 | 90.91 | 91.82 | 91.82 | 92.73 | 71.82 | 92.73 | 70.00 | 100.00 | **87.18** |
+| [Openai/gpt-4.1-mini-2025-04-14](https://openai.com/index/gpt-4-1/) | 100.00 | 100.00 | 100.00 | 92.73 | 91.82 | 100.00 | 84.55 | 100.00 | 100.00 | 100.00 | **96.91** |
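
The overall-success column in the table above appears to be the unweighted mean of the ten per-domain success rates, rounded to two decimals. A minimal sketch checking this for the first row (scores copied from the table; the mean-of-domains rule is an assumption inferred from the data, not stated by the authors):

```python
# Assumption: overall success = unweighted mean of the ten domain rates.
# Scores below are the emre/gemma-3-4b-it-tr-reasoning40k row of the table.
domain_scores = {
    "Scientific (RAG)": 73.64, "Ethics": 62.73, "Scenario": 60.91,
    "Creative": 48.18, "Logical": 60.00, "Math": 38.18,
    "Planning": 51.82, "Python": 35.45, "SQL": 41.82,
    "Historical (RAG)": 75.45,
}

overall = round(sum(domain_scores.values()) / len(domain_scores), 2)
print(overall)  # 54.82, matching the row's overall-success value
```

The same check reproduces the other rows' overall figures, so comparing models on that column weights all ten domains equally regardless of how many test items each domain contains.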