Namitoo committed
Commit 7593f3e
1 Parent(s): f82a1ea

Update README.md

Files changed (1): README.md (+233, −0)
README.md CHANGED
@@ -2,4 +2,237 @@
  metrics:
  - bertscore
  - accuracy
+ model-index:
+ - name: StarCoder
+   results:
+   - task:
+       type: text-generation
+     dataset:
+       type: openai_humaneval
+       name: HumanEval (Prompted)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.408
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: openai_humaneval
+       name: HumanEval
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.336
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: mbpp
+       name: MBPP
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.527
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: ds1000
+       name: DS-1000 (Overall Completion)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.26
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (C++)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.3155
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (C#)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2101
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (D)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.1357
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Go)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.1761
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Java)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.3022
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Julia)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2302
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (JavaScript)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.3079
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Lua)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2389
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (PHP)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2608
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Perl)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.1734
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Python)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.3357
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (R)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.155
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Ruby)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.0124
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Racket)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.0007
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Rust)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2184
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Scala)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2761
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Bash)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.1046
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (Swift)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.2274
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: nuprl/MultiPL-E
+       name: MultiPL-HumanEval (TypeScript)
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.3229
+       verified: false
  ---
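
For context: pass@1 values like those listed in the model-index above come from execution-based evaluation, where sampled model completions are run against each benchmark problem's unit tests and the metric estimates the probability that one sampled completion passes. The following is a minimal sketch of that kind of scoring using the Hugging Face `evaluate` library's `code_eval` metric; it is an illustration only, not the pipeline used to produce the numbers above, and the toy problem and candidate completions are made up.

```python
import os
from evaluate import load

# code_eval executes model-generated code; it requires an explicit opt-in
# and should be run in a sandboxed environment in practice.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = load("code_eval")

# One toy HumanEval-style problem (illustrative, not from the benchmark):
# a test string per problem, and a list of candidate completions per problem.
test_cases = ["assert add(2, 3) == 5"]
candidates = [[
    "def add(a, b):\n    return a + b",  # passes the test
    "def add(a, b):\n    return a * b",  # fails the test
]]

# pass@k is estimated from how many of the sampled completions pass.
pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1],
)
print(pass_at_k)  # e.g. {'pass@1': 0.5} for this toy example
```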