imi2 committed on
Commit 6466219 · verified · 1 Parent(s): 0801f09

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -49,7 +49,7 @@ TODO:
 
  ------------
 
- ## T-MAC (larger groupsize 128?)
+ ## llama.cpp Q4_K_M scheme and T-MAC inference -groupsize 128?
 
  | Model | Size | Params | Backend | Threads | Test | t/s (tokens/sec) |
  |-------------------------|---------|--------|---------|---------|--------|----------------------|
@@ -58,4 +58,5 @@ TODO:
  | qwen2 ?B INT_N Q4_K | 1.70 GiB| 3.40 B | CPU | 4 | pp512 | 59.66 ± 0.10 |
  | qwen2 ?B INT_N Q4_K | 1.70 GiB| 3.40 B | CPU | 4 | tg128 | 26.43 ± 0.14 |
 
- [Test Issue Link](https://github.com/microsoft/T-MAC/issues/79)
+ **It's 16.3% faster and 13% smaller.**
+ - [Issue Link](https://github.com/microsoft/T-MAC/issues/79)
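
The added line quotes two relative figures (16.3% faster, 13% smaller), but the baseline rows they are measured against fall outside the hunks shown in this diff. As a rough sketch of how such percentages are conventionally derived, the helpers below compute a relative throughput gain and a relative size reduction. The function names and the round-number inputs in the usage lines are illustrative assumptions, not values taken from the README; substitute the baseline and T-MAC rows from the full table to check the claim.

```python
def percent_faster(new_tps: float, baseline_tps: float) -> float:
    """Relative throughput gain of `new_tps` over `baseline_tps`, in percent."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (new_tps / baseline_tps - 1.0) * 100.0


def percent_smaller(new_size: float, baseline_size: float) -> float:
    """Relative size reduction of `new_size` versus `baseline_size`, in percent."""
    if baseline_size <= 0:
        raise ValueError("baseline_size must be positive")
    return (1.0 - new_size / baseline_size) * 100.0


# Hypothetical placeholder inputs (NOT from the benchmark table):
# a baseline of 100 units against 116.3 t/s and 87 size units.
speed_gain = round(percent_faster(116.3, 100.0), 1)
size_cut = round(percent_smaller(87.0, 100.0), 1)
print(f"{speed_gain}% faster, {size_cut}% smaller")
```

Both helpers take the baseline as the denominator, so "X% faster" and "X% smaller" are each expressed relative to the reference build and not to the T-MAC build.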