Files changed (1)
  1. README.md +115 -7
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  language:
  - en
- pipeline_tag: text-generation
  tags:
  - chat
  - llama
@@ -9,14 +9,109 @@ tags:
  - llaam3
  - finetune
  - chatml
- library_name: transformers
- inference: false
- model_creator: MaziyarPanahi
- quantized_by: MaziyarPanahi
  base_model: meta-llama/Meta-Llama-3.1-70B-Instruct
- model_name: calme-2.2-llama3.1-70b
  datasets:
  - MaziyarPanahi/truthy-dpo-v0.1-axolotl
  ---

  <img src="./calme-2.webp" alt="Calme-2 Models" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
@@ -83,4 +178,17 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.2-llama3.1-7
  # Ethical Considerations

- As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
  ---
  language:
  - en
+ library_name: transformers
  tags:
  - chat
  - llama

  - llaam3
  - finetune
  - chatml
  base_model: meta-llama/Meta-Llama-3.1-70B-Instruct
  datasets:
  - MaziyarPanahi/truthy-dpo-v0.1-axolotl
+ model_name: calme-2.2-llama3.1-70b
+ pipeline_tag: text-generation
+ inference: false
+ model_creator: MaziyarPanahi
+ quantized_by: MaziyarPanahi
+ model-index:
+ - name: calme-2.2-llama3.1-70b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 85.93
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.2-llama3.1-70b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 54.21
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.2-llama3.1-70b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 2.11
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.2-llama3.1-70b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 9.96
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.2-llama3.1-70b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 17.07
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.2-llama3.1-70b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 49.05
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.2-llama3.1-70b
+       name: Open LLM Leaderboard
  ---

  <img src="./calme-2.webp" alt="Calme-2 Models" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
 
  # Ethical Considerations

+ As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.2-llama3.1-70b)
+
+ | Metric              | Value |
+ |---------------------|------:|
+ | Avg.                | 36.39 |
+ | IFEval (0-Shot)     | 85.93 |
+ | BBH (3-Shot)        | 54.21 |
+ | MATH Lvl 5 (4-Shot) |  2.11 |
+ | GPQA (0-shot)       |  9.96 |
+ | MuSR (0-shot)       | 17.07 |
+ | MMLU-PRO (5-shot)   | 49.05 |
+
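
The `Avg.` row in the added results table looks like the unweighted mean of the six benchmark scores. The diff does not say how the leaderboard computes it, so treat that as an assumption. A minimal Python sketch that checks the arithmetic using only the values from the table (the names `scores` and `mean_score` are illustrative, not part of the README):

```python
# Benchmark scores copied from the results table in the diff.
scores = {
    "IFEval (0-Shot)": 85.93,
    "BBH (3-Shot)": 54.21,
    "MATH Lvl 5 (4-Shot)": 2.11,
    "GPQA (0-shot)": 9.96,
    "MuSR (0-shot)": 17.07,
    "MMLU-PRO (5-shot)": 49.05,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


# Assumption: the reported average weights every benchmark equally.
average = mean_score(scores.values())
print(f"Avg. = {average:.2f}")  # prints "Avg. = 36.39"
```

The result agrees with the `36.39` reported in the table, which is consistent with an equal-weight average.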