JetBrains/Mellum-4b-base

dustalov committed 6 days ago

Commit 1637e54 · verified · 1 Parent(s): 53e514c

Update README.md

Files changed (1)

README.md +44 -6

@@ -11,6 +11,42 @@ tags:
 model-index:
 - name: Mellum-4b-base
   results:
+  - task:
+      type: text-generation
+    dataset:
+      type: tianyang/repobench_python_v1.1
+      name: RepoBench 1.1 (Python)
+    metrics:
+    - name: 2k
+      type: pass@1
+      value: 0.2820
+      verified: false
+    - name: 4k
+      type: pass@1
+      value: 0.2795
+      verified: false
+    - name: 8k
+      type: pass@1
+      value: 0.2777
+      verified: false
+    - name: 12k
+      type: pass@1
+      value: 0.2453
+      verified: false
+    - name: 16k
+      type: pass@1
+      value: 0.2110
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: gonglinyuan/safim
+      name: SAFIM
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 0.3811
+      verified: false
   - task:
       type: text-generation
     dataset:
@@ -75,13 +111,15 @@ In addition to the base model scores, we are providing scores for a Mellum fine-
 - Languages: Python and Java
 - Metric: Exact Match (EM), %
-Python Subset:
-| Model                | 2K Context | 4K Context | 8K Context |
-|----------------------|------------|------------|------------|
-| Mellum-4b-sft-python  | 28.09%     | 30.91%     | 30.42%     |
-| Mellum-4b-base        | 26.68%     | 27.58%     | 26.89%     |
-Java Subset:
+### Python Subset
+| Model                  |   2k   |   4k   |   8k   |  12k   |  16k   |  Avg   | Avg ≤ 8k |
+|------------------------|--------|--------|--------|--------|--------|--------|----------|
+| Mellum-4b-sft-python   | 29.24% | 30.60% | 29.77% | 26.80% | 25.43% | 28.37% |  29.87%  |
+| Mellum-4b-base         | 28.20% | 27.95% | 27.77% | 24.53% | 21.10% | 25.91% |  27.97%  |
+### Java Subset
 | Model         | 2K Context | 4K Context | 8K Context |
 |---------------|------------|------------|------------|
 | Mellum-4b-base | 33.15%     | 33.48%     | 27.79%     |
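
The diff does not define the `Avg` and `Avg ≤ 8k` columns of the Python Subset table. A plausible reading, assumed here, is that they are unweighted means of the per-context Exact Match scores. The sketch below recomputes both columns from the per-context values and checks that the `model-index` fractions for Mellum-4b-base equal the table percentages divided by 100. The helper names `summarize` and `matches_model_index` are hypothetical and are not part of the repository.

```python
# Per-context Exact Match scores (%) copied from the Python Subset table.
SCORES = {
    "Mellum-4b-sft-python": {"2k": 29.24, "4k": 30.60, "8k": 29.77, "12k": 26.80, "16k": 25.43},
    "Mellum-4b-base": {"2k": 28.20, "4k": 27.95, "8k": 27.77, "12k": 24.53, "16k": 21.10},
}

# Fractional values copied from the model-index entry for RepoBench 1.1 (Python).
MODEL_INDEX_BASE = {"2k": 0.2820, "4k": 0.2795, "8k": 0.2777, "12k": 0.2453, "16k": 0.2110}

# Context sizes counted by the "Avg <= 8k" column (assumption: 2k, 4k and 8k).
SHORT_CONTEXTS = ("2k", "4k", "8k")


def mean(values):
    """Unweighted arithmetic mean, rounded to two decimals like the table."""
    values = list(values)
    return round(sum(values) / len(values), 2)


def summarize(per_context):
    """Return (mean over all context sizes, mean over context sizes <= 8k)."""
    avg_all = mean(per_context.values())
    avg_short = mean(per_context[c] for c in SHORT_CONTEXTS)
    return avg_all, avg_short


def matches_model_index(per_context, fractions, tol=1e-9):
    """Check that each table percentage equals the model-index fraction times 100."""
    return all(abs(per_context[c] - fractions[c] * 100) < tol for c in fractions)


if __name__ == "__main__":
    for model, per_context in SCORES.items():
        avg_all, avg_short = summarize(per_context)
        print(f"{model}: Avg = {avg_all:.2f}%, Avg <= 8k = {avg_short:.2f}%")
    consistent = matches_model_index(SCORES["Mellum-4b-base"], MODEL_INDEX_BASE)
    print("model-index consistent with table:", consistent)
```

Running the script prints one line per model, which can be compared against the last two columns of the table. The same check can be rerun whenever new context sizes or models are added to the README.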