dustalov committed · Commit 1637e54 · verified · 1 parent: 53e514c

Update README.md

Files changed (1)
  1. README.md +44 -6
README.md CHANGED
@@ -11,6 +11,42 @@ tags:
  model-index:
  - name: Mellum-4b-base
    results:
    - task:
        type: text-generation
      dataset:
@@ -75,13 +111,15 @@ In addition to the base model scores, we are providing scores for a Mellum fine-
  - Languages: Python and Java
  - Metric: Exact Match (EM), %

- Python Subset:
- | Model | 2K Context | 4K Context | 8K Context |
- |----------------------|------------|------------|------------|
- | Mellum-4b-sft-python | 28.09% | 30.91% | 30.42% |
- | Mellum-4b-base | 26.68% | 27.58% | 26.89% |

- Java Subset:
  | Model | 2K Context | 4K Context | 8K Context |
  |---------------|------------|------------|------------|
  | Mellum-4b-base | 33.15% | 33.48% | 27.79% |
 
  model-index:
  - name: Mellum-4b-base
    results:
+   - task:
+       type: text-generation
+     dataset:
+       type: tianyang/repobench_python_v1.1
+       name: RepoBench 1.1 (Python)
+     metrics:
+     - name: 2k
+       type: pass@1
+       value: 0.2820
+       verified: false
+     - name: 4k
+       type: pass@1
+       value: 0.2795
+       verified: false
+     - name: 8k
+       type: pass@1
+       value: 0.2777
+       verified: false
+     - name: 12k
+       type: pass@1
+       value: 0.2453
+       verified: false
+     - name: 16k
+       type: pass@1
+       value: 0.2110
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: gonglinyuan/safim
+       name: SAFIM
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value: 0.3811
+       verified: false
    - task:
        type: text-generation
      dataset:
 
  - Languages: Python and Java
  - Metric: Exact Match (EM), %

+ ### Python Subset
+
+ | Model | 2k | 4k | 8k | 12k | 16k | Avg | Avg ≤ 8k |
+ |------------------------|--------|--------|--------|--------|--------|--------|----------|
+ | Mellum-4b-sft-python | 29.24% | 30.60% | 29.77% | 26.80% | 25.43% | 28.37% | 29.87% |
+ | Mellum-4b-base | 28.20% | 27.95% | 27.77% | 24.53% | 21.10% | 25.91% | 27.97% |
120
+
121
+ ### Java Subset
122
 
 
123
  | Model | 2K Context | 4K Context | 8K Context |
124
  |---------------|------------|------------|------------|
125
  | Mellum-4b-base | 33.15% | 33.48% | 27.79% |
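
Review note: the `Avg` and `Avg ≤ 8k` columns added to the Python Subset table can be checked against the per-context scores. The sketch below assumes both are plain unweighted means of the Exact Match percentages, with `Avg ≤ 8k` covering only the 2k, 4k, and 8k columns. The README does not state how the averages were computed, so this is an assumption, and the names `scores`, `SHORT_CONTEXTS`, and `summarize` are illustrative only.

```python
# Per-context Exact Match scores (%) copied from the Python Subset table.
scores = {
    "Mellum-4b-sft-python": {"2k": 29.24, "4k": 30.60, "8k": 29.77, "12k": 26.80, "16k": 25.43},
    "Mellum-4b-base": {"2k": 28.20, "4k": 27.95, "8k": 27.77, "12k": 24.53, "16k": 21.10},
}

# Assumption: "Avg ≤ 8k" is the mean over these three context lengths.
SHORT_CONTEXTS = ("2k", "4k", "8k")


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def summarize(per_context):
    """Return (Avg, Avg ≤ 8k) for one model, each rounded to two decimals."""
    avg_all = round(mean(per_context.values()), 2)
    avg_short = round(mean(per_context[c] for c in SHORT_CONTEXTS), 2)
    return avg_all, avg_short


for model, per_context in scores.items():
    avg_all, avg_short = summarize(per_context)
    print(f"{model}: Avg = {avg_all:.2f}%, Avg ≤ 8k = {avg_short:.2f}%")
```

Running it prints one line per model, which can be compared with the last two columns of the table.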