quazim committed
Commit 49fa280 · verified · 1 Parent(s): 3b5baca

Update README.md

Files changed (1)
  1. README.md +2 -3
README.md CHANGED
@@ -136,13 +136,12 @@ Benchmarking is one of the most important procedures during model acceleration.

  ### Latency benchmarks

- TODO: UPLOAD BENCHS
  __100 input/300 output; tok/s:__

  | GPU/Model | S | M | L | XL | Original | W8A8, int8 |
  |-----------|-----|---|---|----|----------|------------|
- | H100 | 189 | 166 | 148 | 134 | 49 | 192 |
- | L40s | 79 | 68 | 59 | 47 | 38 | 82 |
+ | H100 | 436 | 436 | 409 | 396 | 110 | 439 |
+ | L40s | 290 | 251 | 222 | 210 | 103 | 300 |
