JetBrains/Mellum-4b-base

topshik committed 5 days ago

Commit 99a3423 (verified) · 1 parent: 15315f0

Update README.md

Files changed (1)

README.md +2 -2

@@ -267,8 +267,6 @@ Since Mellum has a maximum context window of 8k, we report here both the average
 | Mellum-4b-sft-python | 33.16%      | 36.11%  | 57.10% | 42.12%  |
 | Mellum-4b-base       | 25.30%      | 38.39%  | 50.65% | 38.11%  |
-We are investigating potential issues with SAFIM evaluation.
 ## HumanEval Infilling
 - Type: single-line and multi-line
 - Languages: Python
@@ -279,6 +277,8 @@ We are investigating potential issues with SAFIM evaluation.
 | Mellum-4b-sft-python | 80.45%      | 48.19%     | 37.68%      |
 | Mellum-4b-base       | 66.21%      | 38.52%     | 29.70%      |
+We continue to work on model improvements and will share the next iteration soon.
 # Limitations
 - Biases: May reflect biases present in public codebases. For example it will likely produce code which is similar in style to the open-source repositories.
 - Security: Code suggestions should not be assumed to be secure or free of vulnerabilities.