This is the [inceptionai/jais-adapted-13b-chat](https://huggingface.co/inceptionai/jais-adapted-13b-chat) model converted to [OpenVINO](https://docs.openvino.ai/2025/index.html) with INT4 weight compression.
## Download the model

- Install huggingface-hub:

```sh
pip install "huggingface-hub[cli]"
```

- Download the model (a Python alternative is sketched after this step):

```sh
huggingface-cli download helenai/jais-adapted-13b-chat-ov-int4-sym --local-dir jais-adapted-13b-chat-ov-int4-sym
```
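If you prefer to stay in Python, the same files can be fetched with the `huggingface_hub` API; a minimal sketch, equivalent to the CLI command above:

```python
from huggingface_hub import snapshot_download

# Fetch all files of the model repo into a local directory,
# mirroring the huggingface-cli command above.
snapshot_download(
    repo_id="helenai/jais-adapted-13b-chat-ov-int4-sym",
    local_dir="jais-adapted-13b-chat-ov-int4-sym",
)
```

`snapshot_download` is part of `huggingface_hub`, which the previous step already installed.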
## Run inference

The recommended way to run inference with this model is with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai). It is the only package needed for inference; there is no need to install Transformers or PyTorch. A minimal example of calling OpenVINO GenAI directly from Python is sketched after the steps below.

- Install OpenVINO GenAI (2025.1 or later):

```sh
pip install --pre --upgrade openvino-genai
```

- Download a chat sample script (`curl -O` works in Windows Command Prompt and most Linux terminals):

```sh
curl -O https://raw.githubusercontent.com/helena-intel/snippets/refs/heads/main/llm_chat/python/llm_chat.py
```

- Run the chat script with the path to the model and the device as parameters. Change GPU to CPU to run on CPU; NPU is not yet supported for this model:

```sh
python llm_chat.py jais-adapted-13b-chat-ov-int4-sym GPU
```
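If you just want a single completion rather than the interactive chat loop, OpenVINO GenAI's `LLMPipeline` can be called directly; a minimal sketch, using the model directory and device from the step above (`max_new_tokens` is just an illustrative limit):

```python
import openvino_genai

# Load the INT4 model from the downloaded directory on the GPU
# (use "CPU" instead to run on CPU).
pipe = openvino_genai.LLMPipeline("jais-adapted-13b-chat-ov-int4-sym", "GPU")

# Generate a short completion for a single prompt.
print(pipe.generate("What is OpenVINO?", max_new_tokens=128))
```

For multi-turn chat with history, `LLMPipeline` also exposes `start_chat()` and `finish_chat()`; see the samples linked below.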
## More information

Check out the [OpenVINO GenAI documentation](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html) and the [OpenVINO GenAI samples](https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation).
## Model compression parameters

```
openvino_version      : 2025.0.0-17942-1f68be9f594-releases/2025/0

advanced_parameters   : {'statistics_path': None, 'awq_params': {'subset_size': 32, 'percent_to_apply': 0.002, 'alpha_min': 0.0, 'alpha_max': 1.0, 'steps': 100}, 'scale_estimation_params': {'subset_size': 64, 'initial_steps': 5, 'scale_steps': 5, 'weight_penalty': -1.0}, 'gptq_params': {'damp_percent': 0.1, 'block_size': 128, 'subset_size': 128}, 'lora_correction_params': {'adapter_rank': 8, 'num_iterations': 3, 'apply_regularization': True, 'subset_size': 128, 'use_int8_adapters': True}}
all_layers            : False
awq                   : False
backup_mode           : int8_asym
gptq                  : False
group_size            : -1
ignored_scope         : []
lora_correction       : False
mode                  : int4_sym
ratio                 : 1.0
scale_estimation      : False
sensitivity_metric    : weight_quantization_error

optimum_intel_version : 1.22.0
optimum_version       : 1.24.0
pytorch_version       : 2.5.1+cpu
transformers_version  : 4.44.2
```
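For reference, the `optimum_intel_version` entry above suggests the weights were compressed with optimum-intel. A conversion with the same settings could look roughly like the sketch below (`bits=4, sym=True` corresponds to `mode: int4_sym`, `group_size=-1` means per-channel scales, `ratio=1.0` compresses all eligible layers); this is a minimal sketch under those assumptions, not the exact invocation used:

```python
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

# INT4 symmetric weight compression with per-channel scales (group_size=-1),
# applied to 100% of eligible layers (ratio=1.0), matching the parameters above.
quantization_config = OVWeightQuantizationConfig(bits=4, sym=True, group_size=-1, ratio=1.0)

model = OVModelForCausalLM.from_pretrained(
    "inceptionai/jais-adapted-13b-chat",
    export=True,  # convert the original PyTorch checkpoint to OpenVINO IR
    quantization_config=quantization_config,
)
model.save_pretrained("jais-adapted-13b-chat-ov-int4-sym")
```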