matrixportal committed · verified
Commit 354f53b · 1 Parent(s): 9dc4359

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +20 -59
README.md CHANGED
@@ -28,7 +28,7 @@ library_name: transformers
 license: cc-by-nc-4.0
 tags:
 - llama-cpp
-- gguf-my-repo
+- matrixportal
 inference: false
 extra_gated_prompt: By submitting this form, you agree to the [License Agreement](https://cohere.com/c4ai-cc-by-nc-license) and
   acknowledge that the information you provide will be collected, used, and shared
@@ -292,66 +292,27 @@ extra_gated_fields:
 I agree to use this model for non-commercial use ONLY: checkbox
 ---
 
-## Quant List
-
-You can download the desired quant version from the list here.
-| Link | Type | Size/GB | Notes |
-|:-----|:-----|--------:|:------|
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q2_k.gguf) | Q2_K | 3.44 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_s.gguf) | Q3_K_S | 3.87 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | Q3_K_M | 4.22 | lower quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_l.gguf) | Q3_K_L | 4.53 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_0.gguf) | Q4_0 | 4.80 | Arm, fast |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_s.gguf) | Q4_K_S | 4.83 | fast, recommended |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | Q4_K_M | 5.06 | fast, recommended |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_0.gguf) | Q5_0 | 5.67 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_s.gguf) | Q5_K_S | 5.67 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_m.gguf) | Q5_K_M | 5.80 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q6_k.gguf) | Q6_K | 6.60 | very good quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q8_0.gguf) | Q8_0 | 8.54 | fast, best quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-f16.gguf) | f16 | 16.07 | 16 bpw, overkill |
-
-
 # matrixportal/aya-23-8B-GGUF
-This model was converted to GGUF format from [`CohereForAI/aya-23-8B`](https://huggingface.co/CohereForAI/aya-23-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
+This model was converted to GGUF format from [`CohereForAI/aya-23-8B`](https://huggingface.co/CohereForAI/aya-23-8B) using llama.cpp via ggml.ai's [all-gguf-same-where](https://huggingface.co/spaces/matrixportal/all-gguf-same-where) space.
 Refer to the [original model card](https://huggingface.co/CohereForAI/aya-23-8B) for more details on the model.
 
-## Use with llama.cpp
-Install llama.cpp through brew (works on Mac and Linux)
-
-```bash
-brew install llama.cpp
-```
-Invoke the llama.cpp server or the CLI.
-
-### CLI:
-```bash
-llama-cli --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -p "The meaning to life and the universe is"
-```
-
-### Server:
-```bash
-llama-server --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -c 2048
-```
-
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-
-Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-
-Step 3: Run inference through the main binary.
-```
-./llama-cli --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -c 2048
-```
+## Quantized Models Download List
+**✨ Recommended for CPU:** `Q4_K_M` | **⚡ Recommended for ARM CPU:** `Q4_0` | **🏆 Best Quality:** `Q8_0`
+
+| 🚀 Download | 🔢 Type | 📝 Notes |
+|:---------|:-----|:------|
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q2_k.gguf) | ![Q2_K](https://img.shields.io/badge/Q2_K-1A73E8) | Basic quantization |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_s.gguf) | ![Q3_K_S](https://img.shields.io/badge/Q3_K_S-34A853) | Small size |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | ![Q3_K_M](https://img.shields.io/badge/Q3_K_M-FBBC05) | Balanced quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_l.gguf) | ![Q3_K_L](https://img.shields.io/badge/Q3_K_L-4285F4) | Better quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_0.gguf) | ![Q4_0](https://img.shields.io/badge/Q4_0-EA4335) | Fast on ARM |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_s.gguf) | ![Q4_K_S](https://img.shields.io/badge/Q4_K_S-673AB7) | Fast, recommended |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_m.gguf) | ![Q4_K_M](https://img.shields.io/badge/Q4_K_M-673AB7) ⭐ | Best balance |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_0.gguf) | ![Q5_0](https://img.shields.io/badge/Q5_0-FF6D01) | Good quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_s.gguf) | ![Q5_K_S](https://img.shields.io/badge/Q5_K_S-0F9D58) | Balanced |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_m.gguf) | ![Q5_K_M](https://img.shields.io/badge/Q5_K_M-0F9D58) | High quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q6_k.gguf) | ![Q6_K](https://img.shields.io/badge/Q6_K-4285F4) 🏆 | Very good quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q8_0.gguf) | ![Q8_0](https://img.shields.io/badge/Q8_0-EA4335) ⚡ | Fast, best quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-f16.gguf) | ![F16](https://img.shields.io/badge/F16-000000) | Maximum accuracy |
+
+💡 **Tip:** Use `F16` for maximum precision when quality is critical.
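The download table in the new README lists one direct `resolve/main` link per quant type, all following the same file-naming pattern. As a minimal sketch of that pattern (the `quant_url` helper below is hypothetical, not part of this repo), the URL for any quant can be derived from the quant type, and the commented lines show how the same file could be fetched locally with `huggingface_hub`:

```python
# Hypothetical helper (illustration only): build the direct-download URL for a
# given quant type, following the file naming used in the table above.
REPO_ID = "matrixportal/aya-23-8B-GGUF"
MODEL_BASE = "aya-23-8b"

def quant_url(quant: str) -> str:
    """Return the resolve URL for a quant type such as 'Q4_K_M' or 'F16'."""
    # Files are named <base>-<quant-in-lowercase>.gguf, e.g. aya-23-8b-q4_k_m.gguf
    filename = f"{MODEL_BASE}-{quant.lower()}.gguf"
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{filename}"

# Recommended CPU quant from the table:
print(quant_url("Q4_K_M"))

# The same file can be fetched to a local cache with huggingface_hub:
#   from huggingface_hub import hf_hub_download
#   path = hf_hub_download(repo_id=REPO_ID, filename=f"{MODEL_BASE}-q4_k_m.gguf")
```

This only reconstructs the links already present in the table; for actual downloads, the `hf_hub_download` route handles caching and resumable transfers.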