psynote123 committed on
Commit f9a40c0 · verified · 1 Parent(s): e43f0af

Update README.md

Files changed (1)
  1. README.md +0 -8
README.md CHANGED
@@ -164,14 +164,6 @@ Benchmarking is one of the most important procedures during model acceleration.
 
  ### Latency benchmarks
 
- __100 input/300 output; tok/s:__
-
- | GPU/Model | S | M | L | XL | Original | W8A8, int8 |
- |-----------|-----|---|---|----|----------|------------|
- | H100 | 90 | 82 | 72 | 54 | 41 | 95 | - |
- | L40S | 25 | 23 | 20 | -1 | -1 | 27 | - |
-
-
  ### Performance by Context Size
 
  The tables below show performance (tokens per second) for different input context sizes across different GPU models and batch sizes:
 