topshik committed on
Commit 99a3423 · verified · 1 Parent(s): 15315f0

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -267,8 +267,6 @@ Since Mellum has a maximum context window of 8k, we report here both the average
 | Mellum-4b-sft-python | 33.16% | 36.11% | 57.10% | 42.12% |
 | Mellum-4b-base | 25.30% | 38.39% | 50.65% | 38.11% |
 
-We are investigating potential issues with SAFIM evaluation.
-
 ## HumanEval Infilling
 - Type: single-line and multi-line
 - Languages: Python
@@ -279,6 +277,8 @@ We are investigating potential issues with SAFIM evaluation.
 | Mellum-4b-sft-python | 80.45% | 48.19% | 37.68% |
 | Mellum-4b-base | 66.21% | 38.52% | 29.70% |
 
+We continue to work on model improvements and will share the next iteration soon.
+
 # Limitations
 - Biases: May reflect biases present in public codebases. For example it will likely produce code which is similar in style to the open-source repositories.
 - Security: Code suggestions should not be assumed to be secure or free of vulnerabilities.