lbourdois committed
Commit 7abc104 · verified · 1 Parent(s): 33f18aa

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.

Files changed (1): README.md (+149 -137)
README.md CHANGED
@@ -13,5 +13,17 @@
  license: apache-2.0
  language:
- - en
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
  datasets:
  - rubenroy/GammaCorpus-v2-50k

The full README.md after the change:
---
base_model: Qwen/Qwen2.5-14B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- gammacorpus
- zurich
- chat
- conversational
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
datasets:
- rubenroy/GammaCorpus-v2-50k
pipeline_tag: text-generation
library_name: transformers
---
![Zurich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-14B-50k.png)

# Zurich 14B GammaCorpus v2-50k
*A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*

## Overview
Zurich 14B GammaCorpus v2-50k is a fine-tune of Alibaba's **Qwen 2.5 14B Instruct** model. Zurich is designed to outperform other models of a similar size while also showcasing [GammaCorpus v2-50k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-50k).
## Model Details
- **Base Model:** [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
- **Type:** Causal language model
- **Architecture:** Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
- **Number of Parameters:** 14.7B
- **Number of Parameters (Non-Embedding):** 13.1B
- **Number of Layers:** 48
- **Number of Attention Heads (GQA):** 40 for Q and 8 for KV
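These figures can be checked against the hosted configuration. A minimal sketch, assuming the standard Qwen2-family attribute names in `transformers`:

```python
from transformers import AutoConfig

# Read the architecture hyperparameters from the model's config on the Hub
config = AutoConfig.from_pretrained("rubenroy/Zurich-14B-GCv2-50k")

print(config.num_hidden_layers)    # expected: 48
print(config.num_attention_heads)  # expected: 40 (query heads)
print(config.num_key_value_heads)  # expected: 8 (KV heads, GQA)
```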
## Training Details

Zurich-14B-GCv2-50k was fine-tuned on a single A100 GPU for ~20 minutes using the [Unsloth](https://unsloth.ai/) framework, and was trained for **60 epochs**.
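The exact training script and hyperparameters are not published. For orientation only, a minimal sketch of what an Unsloth setup for this base model typically looks like (Unsloth fine-tunes are often LoRA-based; every value below is an illustrative placeholder, not the recipe used for Zurich):

```python
from unsloth import FastLanguageModel

# Load the base model with Unsloth's optimized loader
# (4-bit loading and the sequence length are illustrative choices)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-14B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules are placeholder values
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# From here, training would typically be run with TRL's SFTTrainer
# on the GammaCorpus-v2-50k conversations.
```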
## Usage

### Requirements

We **strongly** recommend using the latest version of the `transformers` package. You can install it via `pip`:

```
pip install transformers
```
### Quickstart

Here is a code snippet with `apply_chat_template` that shows how to load the tokenizer and model and how to generate content:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-14B-GCv2-50k"

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"
messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 14B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation as a prompt string, then tokenize it
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens from each output sequence
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
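For interactive use, you may prefer to stream tokens as they are generated. A minimal sketch using `transformers`' built-in `TextStreamer`, reusing `model`, `tokenizer`, and `model_inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    **model_inputs,
    max_new_tokens=512,
    streamer=streamer
)
```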
## About GammaCorpus

This model, and all Zurich models, are trained with GammaCorpus. GammaCorpus is a dataset on Hugging Face filled with structured and filtered multi-turn conversations.
GammaCorpus has 4 versions, each available in several sizes:
### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED

Here is a link to the GCv1 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60
### GammaCorpus v2
- 10k
- **50k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.**
- 100k
- 500k
- 1m
- 5m

Here is a link to the GCv2 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df
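If you want to inspect the exact training data, it can be loaded with the `datasets` library. A minimal sketch (the `train` split name is the usual default and is an assumption here):

```python
from datasets import load_dataset

# Download GammaCorpus-v2-50k and look at the first conversation
ds = load_dataset("rubenroy/GammaCorpus-v2-50k", split="train")
print(ds)
print(ds[0])
```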
### GammaCorpus CoT
- Math 170k

Here is a link to the GC-CoT dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f

### GammaCorpus QA
- Fact 450k

Here is a link to the GC-QA dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7

The full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).
## Known Limitations

- **Bias:** We have tried our best to mitigate as much bias as we can, but please be aware that the model might still generate some biased answers.
## Additional Information

### Licensing Information

The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.