Update README.md
README.md
@@ -136,13 +136,12 @@ Benchmarking is one of the most important procedures during model acceleration.
 
 ### Latency benchmarks
 
-TODO: UPLOAD BENCHS
 __100 input/300 output; tok/s:__
 
 | GPU/Model | S | M | L | XL | Original | W8A8, int8 |
 |-----------|-----|---|---|----|----------|------------|
-| H100 |
-| L40s |
+| H100 | 436 | 436 | 409 | 396 | 110 | 439 |
+| L40s | 290 | 251 | 222 | 210 | 103 | 300 |
 
 
 
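One way to read the added rows is as throughput relative to the Original column. The sketch below is a minimal Python illustration, assuming the figures are tokens per second as the table caption states. The `BENCH` dictionary and the `speedups` helper are hypothetical names introduced here for illustration and are not part of the repository.

```python
# Throughput figures (tok/s) copied from the two rows added in this commit.
BENCH = {
    "H100": {"S": 436, "M": 436, "L": 409, "XL": 396, "Original": 110, "W8A8, int8": 439},
    "L40s": {"S": 290, "M": 251, "L": 222, "XL": 210, "Original": 103, "W8A8, int8": 300},
}


def speedups(row, baseline="Original"):
    """Return each column's throughput divided by the baseline column, rounded to 2 places."""
    base = row[baseline]
    return {
        name: round(value / base, 2)
        for name, value in row.items()
        if name != baseline  # the baseline itself is always 1.0, so it is omitted
    }


if __name__ == "__main__":
    for gpu, row in BENCH.items():
        print(gpu, speedups(row))
```

On these numbers, the S column on the H100 works out to roughly 3.96x the Original throughput, and the W8A8 column on the L40s to roughly 2.91x.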