Upload README.md with huggingface_hub
README.md
CHANGED
@@ -41,7 +41,7 @@ More details on model performance across various devices, can be found
 - Decoding length: 4096
 - Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations.
 
-| Model | Device | Chipset | Target Runtime | Response Rate (
+| Model | Device | Chipset | Target Runtime | Response Rate (tokens per second) | Time To First Token (range, seconds) | Tiny MMLU
 |---|---|---|---|---|---|---|
 | Mistral-7B-Instruct-v0_3 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 10.73 | 0.18 - 5.79 | 58.85% | Use Export Script |
 
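For readers interpreting the table row in this diff, the response rate and time-to-first-token columns can be combined into a rough end-to-end latency estimate. The sketch below is illustrative only: the function name `estimate_response_seconds` is hypothetical, and it assumes the first token arrives at the reported time to first token and every later token streams at the reported response rate, which real runs may not match.

```python
# Figures copied from the Mistral-7B-Instruct-v0_3 row (Snapdragon 8 Elite QRD, QNN).
TOKENS_PER_SECOND = 10.73       # "Response Rate (tokens per second)"
TTFT_RANGE_SECONDS = (0.18, 5.79)  # "Time To First Token (range, seconds)"


def estimate_response_seconds(num_tokens: int, tokens_per_second: float,
                              ttft_seconds: float) -> float:
    """Rough latency estimate (hypothetical helper, not part of any library).

    Assumes the first token arrives after ttft_seconds and the remaining
    num_tokens - 1 tokens are generated at a constant tokens_per_second.
    """
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + (num_tokens - 1) / tokens_per_second


# Best-case and worst-case estimates for a 256-token reply.
low = estimate_response_seconds(256, TOKENS_PER_SECOND, TTFT_RANGE_SECONDS[0])
high = estimate_response_seconds(256, TOKENS_PER_SECOND, TTFT_RANGE_SECONDS[1])
print(f"256 tokens: {low:.1f} s to {high:.1f} s")
```

Under these assumptions a 256-token reply lands at roughly 24 to 30 seconds, with the spread coming entirely from the time-to-first-token range, which typically grows with prompt length.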