helenai committed d4df5b9 (verified) · 1 Parent(s): 4ae85b6

Update README.md

Files changed (1): README.md (+67, -3). The commit removes the `license: apache-2.0` YAML front matter and adds the content below.
This is the [inceptionai/jais-adapted-13b-chat](https://huggingface.co/inceptionai/jais-adapted-13b-chat) model converted to [OpenVINO](https://docs.openvino.ai/2025/index.html) with INT4 weight compression.

## Download the model

- Install huggingface-hub:

```sh
pip install huggingface-hub[cli]
```

- Download the model:

```sh
huggingface-cli download helenai/jais-adapted-13b-chat-ov-int4-sym --local-dir jais-adapted-13b-chat-ov-int4-sym
```
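If you prefer to script the download, the CLI call above maps to `huggingface_hub.snapshot_download`; a minimal sketch (note that running it fetches several gigabytes of weights):

```python
from huggingface_hub import snapshot_download

# Download the whole model repository to a local folder
# (several GB of INT4 weights, same result as the CLI command above)
local_dir = snapshot_download(
    repo_id="helenai/jais-adapted-13b-chat-ov-int4-sym",
    local_dir="jais-adapted-13b-chat-ov-int4-sym",
)
print(local_dir)  # path to the downloaded model folder
```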

## Run inference

The recommended way to run inference with this model is [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai). It is the only package needed for inference; there is no need to install Transformers or PyTorch.

- Install OpenVINO GenAI (2025.1 or later):

```sh
pip install --pre --upgrade openvino-genai
```

- Download a chat sample script (`curl -O` works on Windows Command Prompt and most Linux terminals):

```sh
curl -O https://raw.githubusercontent.com/helena-intel/snippets/refs/heads/main/llm_chat/python/llm_chat.py
```

- Run the chat script with the path to the model and the device as parameters. Change GPU to CPU to run on CPU; NPU is not yet supported for this model.

```sh
python llm_chat.py jais-adapted-13b-chat-ov-int4-sym GPU
```

## More information

Check out [OpenVINO GenAI documentation](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html) and [OpenVINO GenAI samples](https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation).

## Model compression parameters

```
openvino_version      : 2025.0.0-17942-1f68be9f594-releases/2025/0

advanced_parameters   : {'statistics_path': None, 'awq_params': {'subset_size': 32, 'percent_to_apply': 0.002, 'alpha_min': 0.0, 'alpha_max': 1.0, 'steps': 100}, 'scale_estimation_params': {'subset_size': 64, 'initial_steps': 5, 'scale_steps': 5, 'weight_penalty': -1.0}, 'gptq_params': {'damp_percent': 0.1, 'block_size': 128, 'subset_size': 128}, 'lora_correction_params': {'adapter_rank': 8, 'num_iterations': 3, 'apply_regularization': True, 'subset_size': 128, 'use_int8_adapters': True}}
all_layers            : False
awq                   : False
backup_mode           : int8_asym
gptq                  : False
group_size            : -1
ignored_scope         : []
lora_correction       : False
mode                  : int4_sym
ratio                 : 1.0
scale_estimation      : False
sensitivity_metric    : weight_quantization_error

optimum_intel_version : 1.22.0
optimum_version       : 1.24.0
pytorch_version       : 2.5.1+cpu
transformers_version  : 4.44.2
```
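These settings correspond to a weight-only INT4 export with optimum-intel. As a sketch, an `optimum-cli` command with matching options (mode int4_sym, group_size -1, ratio 1.0) would look roughly like the following; the exact invocation used to produce this model is not recorded here:

```sh
# Sketch only: converting and compressing the original 13B model
# involves a large download and a long-running export.
optimum-cli export openvino \
  --model inceptionai/jais-adapted-13b-chat \
  --weight-format int4 \
  --sym \
  --group-size -1 \
  --ratio 1.0 \
  jais-adapted-13b-chat-ov-int4-sym
```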