fbaldassarri committed · Commit da1fd73 · verified · 1 Parent(s): fde938c

Upload README.md

Files changed (1):
  1. README.md +8 -8
README.md CHANGED
@@ -33,14 +33,14 @@ quantized_by: fbaldassarri
  ## Model Information
 
  Quantized version of [tiiuae/Falcon3-10B-Instruct](https://huggingface.co/tiiuae/Falcon3-10B-Instruct) using torch.float32 for quantization tuning.
- - 4 bits (INT4)
+ - 8 bits (INT8)
  - group size = 128
  - Asymmetrical Quantization
  - Method AutoGPTQ
 
- Quantization framework: [Intel AutoRound](https://github.com/intel/auto-round) v0.4.4
+ Quantization framework: [Intel AutoRound](https://github.com/intel/auto-round) v0.4.5
 
- Note: this INT4 version of Falcon3-10B-Instruct has been quantized to run inference on CPU.
+ Note: this INT8 version of Falcon3-10B-Instruct has been quantized to run inference on CPU.
 
  ## Replication Recipe
 
@@ -49,9 +49,9 @@ Note: this INT4 version of Falcon3-10B-Instruct has been quantized to run infere
  I suggest installing the requirements into a dedicated python virtualenv or a conda environment.
 
  ```
- wget https://github.com/intel/auto-round/archive/refs/tags/v0.4.4.tar.gz
- tar -xvzf v0.4.4.tar.gz
- cd auto-round-0.4.4
+ wget https://github.com/intel/auto-round/archive/refs/tags/v0.4.5.tar.gz
+ tar -xvzf v0.4.5.tar.gz
+ cd auto-round-0.4.5
  pip install -r requirements-cpu.txt --upgrade
  ```
 
@@ -69,10 +69,10 @@ pip install -vvv --no-build-isolation -e .[cpu]
  model = AutoModelForCausalLM.from_pretrained(model_name)
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  from auto_round import AutoRound
- bits, group_size, sym, device, amp = 4, 128, False, 'cpu', False
+ bits, group_size, sym, device, amp = 8, 128, False, 'cpu', False
  autoround = AutoRound(model, tokenizer, nsamples=128, iters=200, seqlen=512, batch_size=4, bits=bits, group_size=group_size, sym=sym, device=device, amp=amp)
  autoround.quantize()
- output_dir = "./AutoRound/tiiuae_Falcon3-10B-Instruct-autogptq-int4-gs128-asym"
+ output_dir = "./AutoRound/tiiuae_Falcon3-10B-Instruct-autogptq-int8-gs128-asym"
  autoround.save_quantized(output_dir, format='auto_gptq', inplace=True)
  ```
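For readers unfamiliar with the settings the diff changes (bits, group size, asymmetric), here is a minimal NumPy sketch, not part of the committed README, of what 8-bit asymmetric quantization with group size 128 means for a single weight row. The function name and the plain round-to-nearest scheme are illustrative assumptions; AutoRound tunes the rounding rather than applying it naively, so this conveys only the storage format.

```
# Illustrative sketch: 8-bit asymmetric, group-wise (group size 128) quantization.
# Round-to-nearest is shown for simplicity; AutoRound learns the rounding.
import numpy as np

def quantize_group_asym(w: np.ndarray, bits: int = 8):
    """Quantize one group of weights asymmetrically to `bits`-bit integers."""
    qmax = (1 << bits) - 1                        # 255 for INT8
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / qmax if w_max > w_min else 1.0
    zero_point = np.round(-w_min / scale)         # asymmetric: non-zero offset
    q = np.clip(np.round(w / scale + zero_point), 0, qmax)
    return q.astype(np.uint8), scale, zero_point

# A 512-wide weight row splits into 512 / 128 = 4 independent groups,
# each carrying its own (scale, zero_point) pair.
row = np.random.randn(512).astype(np.float32)
for g in row.reshape(-1, 128):
    q, scale, zp = quantize_group_asym(g)
    dequant = (q.astype(np.float32) - zp) * scale
    print(f"scale={scale:.5f} zp={zp:.0f} max_err={np.abs(dequant - g).max():.5f}")
```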
 
 
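Once the checkpoint is saved, a minimal sketch of running CPU inference on it, assuming transformers plus a GPTQ-capable backend (e.g. optimum with auto-gptq, or gptqmodel) are installed; the prompt is illustrative, and `quantized_dir` reuses the output path from the recipe above.

```
# Hedged sketch: load the exported auto_gptq checkpoint for CPU inference.
from transformers import AutoModelForCausalLM, AutoTokenizer

quantized_dir = "./AutoRound/tiiuae_Falcon3-10B-Instruct-autogptq-int8-gs128-asym"

tokenizer = AutoTokenizer.from_pretrained(quantized_dir)
model = AutoModelForCausalLM.from_pretrained(quantized_dir, device_map="cpu")

inputs = tokenizer("What is group-wise quantization?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```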