Upload README.md with huggingface_hub
README.md
CHANGED
@@ -28,7 +28,7 @@ library_name: transformers
 license: cc-by-nc-4.0
 tags:
 - llama-cpp
-
 inference: false
 extra_gated_prompt: By submitting this form, you agree to the [License Agreement](https://cohere.com/c4ai-cc-by-nc-license) and
 acknowledge that the information you provide will be collected, used, and shared
@@ -292,66 +292,27 @@ extra_gated_fields:
 I agree to use this model for non-commercial use ONLY: checkbox
 ---

-## Quant List
-
-You can download the desired quant version from the list here.
-
-| Link | Type | Size/GB | Notes |
-|:-----|:-----|--------:|:------|
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q2_k.gguf) | Q2_K | 3.44 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_s.gguf) | Q3_K_S | 3.87 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | Q3_K_M | 4.22 | lower quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_l.gguf) | Q3_K_L | 4.53 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_0.gguf) | Q4_0 | 4.80 | ARM, fast |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_s.gguf) | Q4_K_S | 4.83 | fast, recommended |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_m.gguf) | Q4_K_M | 5.06 | fast, recommended |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_0.gguf) | Q5_0 | 5.67 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_s.gguf) | Q5_K_S | 5.67 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_m.gguf) | Q5_K_M | 5.80 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q6_k.gguf) | Q6_K | 6.60 | very good quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q8_0.gguf) | Q8_0 | 8.54 | fast, best quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-f16.gguf) | f16 | 16.07 | 16 bpw, overkill |
-
 # matrixportal/aya-23-8B-GGUF
-This model was converted to GGUF format from [`CohereForAI/aya-23-8B`](https://huggingface.co/CohereForAI/aya-23-8B) using llama.cpp via the ggml.ai's [
 Refer to the [original model card](https://huggingface.co/CohereForAI/aya-23-8B) for more details on the model.

-## Use with llama.cpp
-Install llama.cpp through brew (works on Mac and Linux):
-
-```bash
-brew install llama.cpp
-```
-
-Invoke the llama.cpp server or the CLI.
-
-### CLI:
-```bash
-llama-cli --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -p "The meaning to life and the universe is"
-```
-
-### Server:
-```bash
-llama-server --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -c 2048
-```
-
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
-
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-
-Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with any hardware-specific flags (for example, LLAMA_CUDA=1 for NVIDIA GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-
-Step 3: Run inference through the main binary:
-```
-./llama-cli --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -c 2048
-```
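The `llama-server` command above also exposes an OpenAI-compatible HTTP API. As a minimal sketch (the port 8080 default and the `/v1/chat/completions` path are assumptions based on llama.cpp's server defaults, not stated in this card):

```shell
# Assumes llama-server (started as shown above) is listening on its
# default port, 8080; adjust PORT if you passed --port.
PORT=8080
URL="http://localhost:${PORT}/v1/chat/completions"
echo "$URL"

# With the server running, send an OpenAI-style chat request:
#   curl -s "$URL" -H 'Content-Type: application/json' \
#     -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 64}'
```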
 license: cc-by-nc-4.0
 tags:
 - llama-cpp
+- matrixportal
 inference: false
 extra_gated_prompt: By submitting this form, you agree to the [License Agreement](https://cohere.com/c4ai-cc-by-nc-license) and
 acknowledge that the information you provide will be collected, used, and shared
 I agree to use this model for non-commercial use ONLY: checkbox
 ---

 # matrixportal/aya-23-8B-GGUF
+This model was converted to GGUF format from [`CohereForAI/aya-23-8B`](https://huggingface.co/CohereForAI/aya-23-8B) using llama.cpp via ggml.ai's [all-gguf-same-where](https://huggingface.co/spaces/matrixportal/all-gguf-same-where) space.
 Refer to the [original model card](https://huggingface.co/CohereForAI/aya-23-8B) for more details on the model.

+## ✅ Quantized Models Download List
+
+**✨ Recommended for CPU:** `Q4_K_M` | **⚡ Recommended for ARM CPU:** `Q4_0` | **🏆 Best Quality:** `Q8_0`
+
+| 🚀 Download | 🔢 Type | 📝 Notes |
+|:---------|:-----|:------|
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q2_k.gguf) | Q2_K | Basic quantization |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_s.gguf) | Q3_K_S | Small size |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | Q3_K_M | Balanced quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_l.gguf) | Q3_K_L | Better quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_0.gguf) | Q4_0 | Fast on ARM |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_s.gguf) | Q4_K_S | Fast, recommended |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_m.gguf) | Q4_K_M ⭐ | Best balance |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_0.gguf) | Q5_0 | Good quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_s.gguf) | Q5_K_S | Balanced |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_m.gguf) | Q5_K_M | High quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q6_k.gguf) | Q6_K 🏆 | Very good quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q8_0.gguf) | Q8_0 ⚡ | Fast, best quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-f16.gguf) | F16 | Maximum accuracy |

+💡 **Tip:** Use `F16` for maximum precision when quality is critical.
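Every link in the table resolves through the same Hugging Face URL pattern, so any quant can also be fetched from a script. A small sketch (the `q4_k_m` choice is just an example; substitute any type from the table):

```shell
# Build the direct download URL for one quant of this repo.
# File names follow the pattern aya-23-8b-<quant>.gguf.
REPO="matrixportal/aya-23-8B-GGUF"
QUANT="q4_k_m"                      # lowercased type from the table
FILE="aya-23-8b-${QUANT}.gguf"
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"
echo "$URL"

# The same file can be fetched with the Hugging Face CLI:
#   huggingface-cli download "$REPO" "$FILE" --local-dir .
```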