Commit 70dd391 by cpatonn (verified)
1 Parent(s): 29589f1

Update README.md

  1. README.md +18 -0
README.md CHANGED
@@ -4,6 +4,24 @@ library_name: transformers
  base_model:
  - deepcogito/cogito-v2-preview-llama-109B-MoE
  ---
+ # Cogito v2 preview - 109B MoE - GPTQ 4bit
+
+ ## Method
+ Quantised using [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor.git), [nvidia/Llama-Nemotron-Post-Training-Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset) and the following configs:
+ ```
+ recipe = GPTQModifier(
+     targets="Linear",
+     scheme="W4A16",
+     ignore=[
+         "re:.*lm_head",
+         "re:.*self_attn",
+         "re:.*router",
+         "re:vision_model.*",
+         "re:multi_modal_projector.*",
+         "Llama4TextAttention",
+     ],
+ )
+ ```

  <p align="center">
  <img src="images/deep-cogito-logo.png" alt="Logo" width="40%">
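
Note on the `ignore` list added in this commit: the sketch below is a rough, standalone illustration of which modules such a list would leave unquantised. It does not call llm-compressor. It assumes that `re:` entries are regular expressions matched from the start of a module's dotted name, and that plain entries match an exact module name or class name. The helper `is_ignored` and the module names in `examples` are hypothetical and are not taken from the commit.

```python
import re

# Ignore list copied from the recipe in this commit.
IGNORE = [
    "re:.*lm_head",
    "re:.*self_attn",
    "re:.*router",
    "re:vision_model.*",
    "re:multi_modal_projector.*",
    "Llama4TextAttention",
]


def is_ignored(module_name: str, class_name: str, ignore=IGNORE) -> bool:
    """Return True if a module would be skipped under the assumed matching rules."""
    for entry in ignore:
        if entry.startswith("re:"):
            # Assumption: regex entries are matched from the start of the module name.
            if re.match(entry[3:], module_name):
                return True
        elif entry in (module_name, class_name):
            # Assumption: plain entries match an exact module name or class name.
            return True
    return False


# Hypothetical module names, for illustration only.
examples = [
    ("language_model.model.layers.0.self_attn.q_proj", "Linear"),
    ("language_model.model.layers.0.feed_forward.router", "Linear"),
    ("language_model.model.layers.0.feed_forward.experts.gate_up_proj", "Linear"),
    ("vision_model.encoder.layers.0.mlp.fc1", "Linear"),
    ("language_model.lm_head", "Linear"),
]
for name, cls in examples:
    status = "ignored" if is_ignored(name, cls) else "quantised"
    print(f"{name}: {status}")
```

Under these assumptions, only the expert projection in the example list stays a quantisation target. The attention, router, vision and `lm_head` modules are skipped, which would be consistent with a recipe that keeps those layers in their original precision.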