---
library_name: transformers
base_model:
- deepcogito/cogito-v2-preview-llama-109B-MoE
---

# Cogito v2 preview - 109B MoE - GPTQ 4bit

## Method

Quantised with [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor.git), using [nvidia/Llama-Nemotron-Post-Training-Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset) as the calibration set, with the following recipe:
```
from llmcompressor.modifiers.quantization import GPTQModifier

# Quantise the weights of every Linear layer to 4-bit (W4A16), leaving the
# LM head, attention, MoE router and vision/projector modules in full precision.
recipe = GPTQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=[
        "re:.*lm_head",
        "re:.*self_attn",
        "re:.*router",
        "re:vision_model.*",
        "re:multi_modal_projector.*",
        "Llama4TextAttention",
    ],
)
```
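
For reference, here is a minimal sketch of how such a recipe is typically applied with llm-compressor's `oneshot` entry point. The model class, dataset split and column names, sample count, sequence length and output directory below are assumptions for illustration, not the exact settings used to produce this checkpoint:

```
# Sketch only: calibration settings are assumptions, not the exact values used
# for this checkpoint. `recipe` is the GPTQModifier defined above.
from datasets import load_dataset
from transformers import AutoTokenizer, Llama4ForConditionalGeneration

from llmcompressor import oneshot  # older releases: from llmcompressor.transformers import oneshot

MODEL_ID = "deepcogito/cogito-v2-preview-llama-109B-MoE"
NUM_CALIBRATION_SAMPLES = 512   # assumption
MAX_SEQUENCE_LENGTH = 2048      # assumption

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = Llama4ForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)

# Text-only calibration data; the vision tower is excluded by the recipe anyway.
ds = load_dataset(
    "nvidia/Llama-Nemotron-Post-Training-Dataset",
    split="train",  # assumption: pick whichever SFT split is used for calibration
)
ds = ds.shuffle(seed=42).select(range(NUM_CALIBRATION_SAMPLES))

def tokenize(sample):
    # Assumption: records expose "input"/"output" text columns; adapt to the actual schema.
    return tokenizer(
        sample["input"] + sample["output"],
        max_length=MAX_SEQUENCE_LENGTH,
        truncation=True,
        add_special_tokens=False,
    )

ds = ds.map(tokenize, remove_columns=ds.column_names)

# Run calibration and apply the GPTQ recipe in a single pass.
oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

model.save_pretrained("cogito-v2-109B-MoE-GPTQ-W4A16", save_compressed=True)
tokenizer.save_pretrained("cogito-v2-109B-MoE-GPTQ-W4A16")
```

The resulting checkpoint is stored in compressed-tensors format, which vLLM can load directly.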

<p align="center">
  <img src="images/deep-cogito-logo.png" alt="Logo" width="40%">