Improve language tag

#1
by lbourdois - opened
Files changed (1)
  1. README.md +59 -45
README.md CHANGED
@@ -1,46 +1,60 @@
- ---
- library_name: transformers
- license: apache-2.0
- base_model:
- - Qwen/Qwen2.5-32B-Instruct
- pipeline_tag: text-generation
- ---
-
- Converted version of [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) to 4-bit using bitsandbytes. For more information about the model,
- refer to the model's page.
-
- ## Impact on performance
- Impact of quantization on a set of models.
-
- Evaluation of the model was conducted using the PoLL (Pool of LLM) technique, assessing performance on **100 French questions** with scores aggregated from six evaluations
- (two per evaluator). The evaluators included GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet.
-
- Performance Scores (on a scale of 5):
- | Model | Score | # params (Billion) | size (GB) |
- |---------------------------------------------:|:--------:|:------------------:|:---------:|
- | gpt-4o | 4.13 | N/A | N/A |
- | gpt-4o-mini | 4.02 | N/A | N/A |
- | Qwen/Qwen2.5-32B-Instruct | 3.99 | 32.8 | 65.6 |
- | **cmarkea/Qwen2.5-32B-Instruct-4bit** | **3.98** | **32.8** | **16.4** |
- | mistralai/Mixtral-8x7B-Instruct-v0.1 | 3.71 | 46.7 | 93.4 |
- | cmarkea/Mixtral-8x7B-Instruct-v0.1-4bit | 3.68 | 46.7 | 23.35 |
- | meta-llama/Meta-Llama-3.1-70B-Instruct | 3.68 | 70.06 | 140.12 |
- | gpt-3.5-turbo | 3.66 | 175 | 350 |
- | cmarkea/Meta-Llama-3.1-70B-Instruct-4bit | 3.64 | 70.06 | 35.3 |
- | TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ | 3.56 | 46.7 | 46.7 |
- | meta-llama/Meta-Llama-3.1-8B-Instruct | 3.25 | 8.03 | 16.06 |
- | mistralai/Mistral-7B-Instruct-v0.2 | 1.98 | 7.25 | 14.5 |
- | cmarkea/bloomz-7b1-mt-sft-chat | 1.69 | 7.07 | 14.14 |
- | cmarkea/bloomz-3b-dpo-chat | 1.68 | 3 | 6 |
- | cmarkea/bloomz-3b-sft-chat | 1.51 | 3 | 6 |
- | croissantllm/CroissantLLMChat-v0.1 | 1.19 | 1.3 | 2.7 |
- | cmarkea/bloomz-560m-sft-chat | 1.04 | 0.56 | 1.12 |
- | OpenLLM-France/Claire-Mistral-7B-0.1 | 0.38 | 7.25 | 14.5 |
-
- The impact of quantization is negligible.
-
- ## Prompt Pattern
- Here is a reminder of the command pattern to interact with the model:
- ```verbatim
- <|im_start|>user\n{user_prompt_1}<|im_end|>\n<|im_start|>assistant\n{model_answer_1}...
  ```
 
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model:
+ - Qwen/Qwen2.5-32B-Instruct
+ pipeline_tag: text-generation
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+
+ Converted version of [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) to 4-bit using bitsandbytes. For more information about the model,
+ refer to the model's page.
+
+ ## Impact on performance
+ Impact of quantization on a set of models.
+
+ Evaluation of the model was conducted using the PoLL (Pool of LLM) technique, assessing performance on **100 French questions** with scores aggregated from six evaluations
+ (two per evaluator). The evaluators included GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet.
+
+ Performance Scores (on a scale of 5):
+ | Model | Score | # params (Billion) | size (GB) |
+ |---------------------------------------------:|:--------:|:------------------:|:---------:|
+ | gpt-4o | 4.13 | N/A | N/A |
+ | gpt-4o-mini | 4.02 | N/A | N/A |
+ | Qwen/Qwen2.5-32B-Instruct | 3.99 | 32.8 | 65.6 |
+ | **cmarkea/Qwen2.5-32B-Instruct-4bit** | **3.98** | **32.8** | **16.4** |
+ | mistralai/Mixtral-8x7B-Instruct-v0.1 | 3.71 | 46.7 | 93.4 |
+ | cmarkea/Mixtral-8x7B-Instruct-v0.1-4bit | 3.68 | 46.7 | 23.35 |
+ | meta-llama/Meta-Llama-3.1-70B-Instruct | 3.68 | 70.06 | 140.12 |
+ | gpt-3.5-turbo | 3.66 | 175 | 350 |
+ | cmarkea/Meta-Llama-3.1-70B-Instruct-4bit | 3.64 | 70.06 | 35.3 |
+ | TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ | 3.56 | 46.7 | 46.7 |
+ | meta-llama/Meta-Llama-3.1-8B-Instruct | 3.25 | 8.03 | 16.06 |
+ | mistralai/Mistral-7B-Instruct-v0.2 | 1.98 | 7.25 | 14.5 |
+ | cmarkea/bloomz-7b1-mt-sft-chat | 1.69 | 7.07 | 14.14 |
+ | cmarkea/bloomz-3b-dpo-chat | 1.68 | 3 | 6 |
+ | cmarkea/bloomz-3b-sft-chat | 1.51 | 3 | 6 |
+ | croissantllm/CroissantLLMChat-v0.1 | 1.19 | 1.3 | 2.7 |
+ | cmarkea/bloomz-560m-sft-chat | 1.04 | 0.56 | 1.12 |
+ | OpenLLM-France/Claire-Mistral-7B-0.1 | 0.38 | 7.25 | 14.5 |
+
+ The impact of quantization is negligible.
+
+ ## Prompt Pattern
+ Here is a reminder of the command pattern to interact with the model:
+ ```verbatim
+ <|im_start|>user\n{user_prompt_1}<|im_end|>\n<|im_start|>assistant\n{model_answer_1}...
  ```
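
For reference, here is a minimal sketch of how a bitsandbytes 4-bit checkpoint like this one is typically loaded with transformers. The quantization settings shown (NF4, bfloat16 compute) are assumptions for illustration, not confirmed details of this conversion:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "cmarkea/Qwen2.5-32B-Instruct-4bit"

# Assumed 4-bit settings; a checkpoint serialized in 4-bit usually already
# embeds its quantization config, in which case this block is redundant.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```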
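The PoLL setup described in the card (two scores per evaluator from GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet, six scores per question on a scale of 5) can be illustrated with a small aggregation sketch. The card does not state the aggregation method; a plain mean is assumed here, and the scores are made up:

```python
from statistics import mean

# Two illustrative scores per evaluator for one question (scale of 5).
scores_per_evaluator = {
    "gpt-4o": [4.0, 4.5],
    "gemini-1.5-pro": [4.0, 4.0],
    "claude-3.5-sonnet": [3.5, 4.0],
}

# Aggregate the six evaluations; assumed to be a simple mean.
question_score = mean(s for pair in scores_per_evaluator.values() for s in pair)
print(f"{question_score:.2f}")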
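The ChatML pattern shown in the Prompt Pattern section is what `tokenizer.apply_chat_template` produces for Qwen models, so the prompt does not need to be assembled by hand. A sketch reusing `model` and `tokenizer` from the loading example above, with an illustrative question:

```python
messages = [{"role": "user", "content": "Quelle est la capitale de la France ?"}]

# add_generation_prompt=True appends "<|im_start|>assistant\n" so the model
# answers rather than continuing the user turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```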