matrixportal committed · verified
Commit 734e135 · 1 Parent(s): 284e37f

Upload README.md with huggingface_hub

Files changed (1): README.md (+73 −46)

README.md CHANGED
@@ -1,4 +1,5 @@
  ---
  language:
  - en
  - de
@@ -9,6 +10,7 @@ language:
  - es
  - th
  library_name: transformers
  pipeline_tag: text-generation
  tags:
  - facebook
@@ -17,8 +19,7 @@ tags:
  - llama
  - llama-3
  - llama-cpp
- - gguf-my-repo
- license: llama3.2
  extra_gated_prompt: "### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT\n\nLlama 3.2 Version\
  \ Release Date: September 25, 2024\n\n“Agreement” means the terms and conditions\
  \ for use, reproduction, distribution and modification of the Llama Materials set\
@@ -205,49 +206,75 @@ extra_gated_fields:
  extra_gated_description: The information you provide will be collected, stored, processed
  and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
  extra_gated_button_content: Submit
- base_model: meta-llama/Llama-3.2-3B-Instruct
  ---
 
- # matrixportal/Llama-3.2-3B-Instruct-GGUF
- This model was converted to GGUF format from [`meta-llama/Llama-3.2-3B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) for more details on the model.
-
- ## Use with llama.cpp
- Install llama.cpp through brew (works on Mac and Linux)
-
- ```bash
- brew install llama.cpp
-
- ```
- Invoke the llama.cpp server or the CLI.
-
- ### CLI:
- ```bash
- llama-cli --hf-repo matrixportal/Llama-3.2-3B-Instruct-GGUF --hf-file llama-3.2-3b-instruct-f16.gguf -p "The meaning to life and the universe is"
- ```
-
- ### Server:
- ```bash
- llama-server --hf-repo matrixportal/Llama-3.2-3B-Instruct-GGUF --hf-file llama-3.2-3b-instruct-f16.gguf -c 2048
- ```
-
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-
- Step 1: Clone llama.cpp from GitHub.
- ```
- git clone https://github.com/ggerganov/llama.cpp
- ```
-
- Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
- ```
- cd llama.cpp && LLAMA_CURL=1 make
- ```
-
- Step 3: Run inference through the main binary.
- ```
- ./llama-cli --hf-repo matrixportal/Llama-3.2-3B-Instruct-GGUF --hf-file llama-3.2-3b-instruct-f16.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```
- ./llama-server --hf-repo matrixportal/Llama-3.2-3B-Instruct-GGUF --hf-file llama-3.2-3b-instruct-f16.gguf -c 2048
- ```
  ---
+ base_model: meta-llama/Llama-3.2-3B-Instruct
  language:
  - en
  - de
  - es
  - th
  library_name: transformers
+ license: llama3.2
  pipeline_tag: text-generation
  tags:
  - facebook
  - llama
  - llama-3
  - llama-cpp
+ - matrixportal
  extra_gated_prompt: "### LLAMA 3.2 COMMUNITY LICENSE AGREEMENT\n\nLlama 3.2 Version\
  \ Release Date: September 25, 2024\n\n“Agreement” means the terms and conditions\
  \ for use, reproduction, distribution and modification of the Llama Materials set\
  extra_gated_description: The information you provide will be collected, stored, processed
  and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
  extra_gated_button_content: Submit
  ---
 
+ - **Base model:** [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
+ - **License:** [Llama 3 Community License](https://llama.meta.com/llama3/license)
+
+ Quantized with llama.cpp using [all-gguf-same-where](https://huggingface.co/spaces/matrixportal/all-gguf-same-where)
+
+ ## Quantized Models Download List
+
+ ### 🔍 Recommended Quantizations
+ - **✨ General CPU Use:** [`Q4_K_M`](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q4_k_m.gguf) (Best balance of speed/quality)
+ - **📱 ARM Devices:** [`Q4_0`](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q4_0.gguf) (Optimized for ARM CPUs)
+ - **🏆 Maximum Quality:** [`Q8_0`](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q8_0.gguf) (Near-original quality)
+
+ ### 📦 Full Quantization Options
+ | 🚀 Download | 🔢 Type | 📝 Notes |
+ |:---------|:-----|:------|
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q2_k.gguf) | ![Q2_K](https://img.shields.io/badge/Q2_K-1A73E8) | Basic quantization |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q3_k_s.gguf) | ![Q3_K_S](https://img.shields.io/badge/Q3_K_S-34A853) | Small size |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q3_k_m.gguf) | ![Q3_K_M](https://img.shields.io/badge/Q3_K_M-FBBC05) | Balanced quality |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q3_k_l.gguf) | ![Q3_K_L](https://img.shields.io/badge/Q3_K_L-4285F4) | Better quality |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q4_0.gguf) | ![Q4_0](https://img.shields.io/badge/Q4_0-EA4335) | Fast on ARM |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q4_k_s.gguf) | ![Q4_K_S](https://img.shields.io/badge/Q4_K_S-673AB7) | Fast, recommended |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q4_k_m.gguf) | ![Q4_K_M](https://img.shields.io/badge/Q4_K_M-673AB7) ⭐ | Best balance |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q5_0.gguf) | ![Q5_0](https://img.shields.io/badge/Q5_0-FF6D01) | Good quality |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q5_k_s.gguf) | ![Q5_K_S](https://img.shields.io/badge/Q5_K_S-0F9D58) | Balanced |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q5_k_m.gguf) | ![Q5_K_M](https://img.shields.io/badge/Q5_K_M-0F9D58) | High quality |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q6_k.gguf) | ![Q6_K](https://img.shields.io/badge/Q6_K-4285F4) 🏆 | Very good quality |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-q8_0.gguf) | ![Q8_0](https://img.shields.io/badge/Q8_0-EA4335) ⚡ | Fast, best quality |
+ | [Download](https://huggingface.co/matrixportal/Llama-3.2-3B-Instruct-GGUF/resolve/main/llama-3.2-3b-instruct-f16.gguf) | ![F16](https://img.shields.io/badge/F16-000000) | Maximum accuracy |
+
+ 💡 **Tip:** Use `F16` for maximum precision when quality is critical.
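Every link in the table above follows the same Hugging Face `resolve/main` URL pattern, so the direct link for any quant type can be rebuilt from the repo id and the quant tag. A minimal sketch (the variable names are mine; it assumes this repo's lowercase filename convention, `llama-3.2-3b-instruct-<quant>.gguf`):

```bash
REPO="matrixportal/Llama-3.2-3B-Instruct-GGUF"
BASE="llama-3.2-3b-instruct"
QUANT="Q4_K_M"   # any type from the table, e.g. Q8_0 or F16

# Lowercase the quant tag and assemble the direct-download URL.
FILE="$BASE-$(printf '%s' "$QUANT" | tr '[:upper:]' '[:lower:]').gguf"
URL="https://huggingface.co/$REPO/resolve/main/$FILE"
echo "$URL"
```

The resulting URL can be handed to `curl -L -O` or `wget` to fetch the file directly.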
+ ---
+ # 🚀 Applications and Tools for Locally Quantized LLMs
+ ## 🖥️ Desktop Applications
+
+ | Application | Description | Download Link |
+ |-----------------|----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------|
+ | **Llama.cpp** | A fast and efficient inference engine for GGUF models. | [GitHub Repository](https://github.com/ggml-org/llama.cpp) |
+ | **Ollama** | A streamlined solution for running LLMs locally. | [Website](https://ollama.com/) |
+ | **AnythingLLM** | An AI-powered knowledge management tool. | [GitHub Repository](https://github.com/Mintplex-Labs/anything-llm) |
+ | **Open WebUI** | A user-friendly web interface for running local LLMs. | [GitHub Repository](https://github.com/open-webui/open-webui) |
+ | **GPT4All** | A user-friendly desktop chat application supporting various LLMs, compatible with GGUF models, for local, offline interactions. | [GitHub Repository](https://github.com/nomic-ai/gpt4all) |
+ | **LM Studio** | A desktop application designed to run and manage local LLMs, supporting GGUF format. | [Website](https://lmstudio.ai/) |
+
+ ---
+
+ ## 📱 Mobile Applications
+
+ | Application | Description | Download Link |
+ |-------------------|----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------|
+ | **ChatterUI** | A simple and lightweight LLM app for mobile devices. | [GitHub Repository](https://github.com/Vali-98/ChatterUI) |
+ | **Maid** | Mobile Artificial Intelligence Distribution for running AI models on mobile devices. | [GitHub Repository](https://github.com/Mobile-Artificial-Intelligence/maid) |
+ | **PocketPal AI** | A mobile AI assistant powered by local models. | [GitHub Repository](https://github.com/a-ghorbani/pocketpal-ai) |
+ | **Layla** | A flexible platform for running various AI models on mobile devices. | [Website](https://www.layla-network.ai/) |
+
+ ---
+
+ ## 🎨 Image Generation Applications
+
+ | Application | Description | Download Link |
+ |-------------------------------------|----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------|
+ | **Stable Diffusion** | An open-source AI model for generating images from text. | [GitHub Repository](https://github.com/CompVis/stable-diffusion) |
+ | **Stable Diffusion WebUI** | A web application providing access to Stable Diffusion models via a browser interface. | [GitHub Repository](https://github.com/AUTOMATIC1111/stable-diffusion-webui) |
+ | **Local Dream** | Android Stable Diffusion with Snapdragon NPU acceleration. Also supports CPU inference. | [GitHub Repository](https://github.com/xororz/local-dream) |
+ | **Stable-Diffusion-Android (SDAI)** | An open-source AI art application for Android devices, enabling digital art creation. | [GitHub Repository](https://github.com/ShiftHackZ/Stable-Diffusion-Android) |
+
+ ---
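The previous revision of this card carried llama.cpp usage instructions that this update drops. A minimal sketch of the same workflow, using the `--hf-repo`/`--hf-file` flags documented in the prior revision (llama.cpp itself must be installed, e.g. `brew install llama.cpp`); the sketch only assembles and prints the command so it can be reviewed before running:

```bash
# Pick a quant from the table above; Q4_K_M is the recommended CPU default.
REPO="matrixportal/Llama-3.2-3B-Instruct-GGUF"
FILE="llama-3.2-3b-instruct-q4_k_m.gguf"

# llama-cli fetches the GGUF from the Hub on first use with these flags.
CMD="llama-cli --hf-repo $REPO --hf-file $FILE -p \"The meaning of life is\""
echo "$CMD"
```

Run the printed command as-is, or swap `llama-cli` for `llama-server ... -c 2048` to serve the model instead, as the previous revision showed.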