Commit b47cb9c · verified · 1 Parent(s): c516a59
Committed by dustalov

Update README.md

  1. README.md +67 -3
README.md CHANGED
@@ -75,6 +75,70 @@ model-index:
  type: exact_match
  value: 0.2797
  verified: false
+ - task:
+ type: text-generation
+ dataset:
+ type: tianyang/repobench_java_v1.1
+ name: RepoBench 1.1 (Java, 2k)
+ metrics:
+ - name: EM
+ type: exact_match
+ value: 0.3202
+ verified: false
+ - task:
+ type: text-generation
+ dataset:
+ type: tianyang/repobench_java_v1.1
+ name: RepoBench 1.1 (Java, 4k)
+ metrics:
+ - name: EM
+ type: exact_match
+ value: 0.3212
+ verified: false
+ - task:
+ type: text-generation
+ dataset:
+ type: tianyang/repobench_java_v1.1
+ name: RepoBench 1.1 (Java, 8k)
+ metrics:
+ - name: EM
+ type: exact_match
+ value: 0.2910
+ verified: false
+ - task:
+ type: text-generation
+ dataset:
+ type: tianyang/repobench_java_v1.1
+ name: RepoBench 1.1 (Java, 12k)
+ metrics:
+ - name: EM
+ type: exact_match
+ value: 0.2492
+ verified: false
+ - task:
+ type: text-generation
+ dataset:
+ type: tianyang/repobench_java_v1.1
+ name: RepoBench 1.1 (Java, 16k)
+ metrics:
+ - name: EM
+ type: exact_match
+ value: 0.2474
+ verified: false
+ - task:
+ type: text-generation
+ dataset:
+ type: tianyang/repobench_java_v1.1
+ name: RepoBench 1.1 (Java)
+ metrics:
+ - name: EM
+ type: exact_match
+ value: 0.2858
+ verified: false
+ - name: EM ≤ 8k
+ type: exact_match
+ value: 0.3108
+ verified: false
  - task:
  type: text-generation
  dataset:
@@ -156,9 +220,9 @@ In addition to the base model scores, we are providing scores for a Mellum fine-
  | Mellum-4b-base | 28.20% | 27.95% | 27.77% | 24.53% | 21.10% | 25.91% | 27.97% |

  ### Java Subset
- | Model | 2K Context | 4K Context | 8K Context |
- |---------------|------------|------------|------------|
- | Mellum-4b-base | 33.15% | 33.48% | 27.79% |
+ | Model | 2k | 4k | 8k | 12k | 16k | Avg | Avg 8k |
+ |----------------|--------|--------|--------|--------|--------|--------|----------|
+ | Mellum-4b-base | 32.02% | 32.12% | 29.10% | 24.92% | 24.74% | 28.58% | 31.08% |

  ## Syntax-Aware Fill-in-the-Middle (SAFIM)
  - Type: mix of multi-line and single-line
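
Review note: the two aggregate Java figures in this commit (`EM` 0.2858 and `EM ≤ 8k` 0.3108, shown as "Avg" and "Avg 8k" in the table) appear to be plain unweighted means of the per-context scores. The commit does not state how they were computed, so treat that as an assumption. A minimal Python sketch to check the arithmetic, where the helper name `mean_em` is hypothetical and used only for illustration:

```python
# Per-context exact-match scores (percent) from the updated Java Subset row.
java_em = {"2k": 32.02, "4k": 32.12, "8k": 29.10, "12k": 24.92, "16k": 24.74}


def mean_em(scores, contexts):
    """Unweighted mean of the scores for the given context lengths.

    Assumption: the README's averages are simple means; the commit
    itself does not say how they were derived.
    """
    return round(sum(scores[c] for c in contexts) / len(contexts), 2)


avg_all = mean_em(java_em, ["2k", "4k", "8k", "12k", "16k"])
avg_up_to_8k = mean_em(java_em, ["2k", "4k", "8k"])

print(avg_all, avg_up_to_8k)
```

Under that assumption the script reproduces the 28.58% and 31.08% reported in the table, which also agree with the `value` fields added to the `model-index` metadata.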