Update README.md
README.md
CHANGED
@@ -150,10 +150,10 @@ Benchmarking is one of the most important procedures during model acceleration.
 
 | Metric/Model | S | M | L | XL | Original | W8A8, int8 |
 |---------------|---|---|---|----|----------|------------|
-| arc_challenge | 56.20 |
-| mmlu | 65.60 |
-| piqa | 80.60 |
-| winogrande | 74.40 |
+| arc_challenge | 56.20 | 55.88 | 56.57 | 57.80 | 57.80 | 53.10 | - |
+| mmlu | 65.60 | 66.74 | 67.01 | 66.80 | 66.80 | 62.40 | - |
+| piqa | 80.60 | 81.28 | 81.12 | 81.30 | 81.30 | 79.00 | - |
+| winogrande | 74.40 | 74.27 | 75.61 | 76.00 | 76.00 | 71.00 | - |
 
 
 
@@ -163,15 +163,6 @@ Benchmarking is one of the most important procedures during model acceleration.
 * **Winogrande**: Evaluates commonsense reasoning through sentence completion tasks. Shows model's capability to understand context and resolve ambiguity.
 * **GSM8K**: GSM8K (Grade School Math 8K) is a dataset of 8.5K high quality linguistically diverse grade school math word problems.
 
-### Latency benchmarks
-
-__100 input/300 output; tok/s:__
-
-| GPU/Model | S | M | L | XL | Original | W8A8, int8 |
-|-----------|-----|---|---|----|----------|------------|
-| H100 | 64 | 55 | -1 | -1 | 34 | 66 | - |
-| L40S | -1 | -1 | -1 | -1 | -1 | -1 | - |
-
 
 ### Performance by Context Size
 
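As a reading aid for the updated accuracy rows, the sketch below computes each variant's difference from the Original column. It rests on an assumption the diff does not confirm: that the six numeric cells in each added row map, in order, to the header columns S, M, L, XL, Original, and W8A8 int8, and that the trailing `-` cell is surplus. The names `COLUMNS`, `ROWS`, and `delta_vs_original` are illustrative and not part of the repository.

```python
# Accuracy values copied from the added (+) rows of the diff above.
# Assumption: the six numbers map, in order, to the six header columns;
# the trailing "-" cell in each row is ignored.
COLUMNS = ["S", "M", "L", "XL", "Original", "W8A8, int8"]
ROWS = {
    "arc_challenge": [56.20, 55.88, 56.57, 57.80, 57.80, 53.10],
    "mmlu": [65.60, 66.74, 67.01, 66.80, 66.80, 62.40],
    "piqa": [80.60, 81.28, 81.12, 81.30, 81.30, 79.00],
    "winogrande": [74.40, 74.27, 75.61, 76.00, 76.00, 71.00],
}


def delta_vs_original(rows, columns, baseline="Original"):
    """Return, per metric, each variant's score minus the baseline score."""
    base_idx = columns.index(baseline)
    return {
        metric: {
            col: round(val - values[base_idx], 2)
            for col, val in zip(columns, values)
            if col != baseline
        }
        for metric, values in rows.items()
    }


if __name__ == "__main__":
    for metric, deltas in delta_vs_original(ROWS, COLUMNS).items():
        print(metric, deltas)
```

Negative values mean the variant scores below the Original model on that benchmark. If the column mapping assumed here is wrong, the deltas are wrong too.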