Committed by CalamitousFelicitousness
Commit 25fa96c (verified) · 1 parent: ea1b1d1

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,207 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - chat
7
+ pipeline_tag: text-generation
8
+ library_name: transformers
9
+ ---
10
+
11
+ # This repo contains an FP8-quantized copy of the original model. Original: [anthracite-org/magnum-v4-72b](https://huggingface.co/anthracite-org/magnum-v4-72b)
12
+
13
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/658a46cbfb9c2bdfae75b3a6/ZmOOkB2QwItLmoqmnxNWO.png)
14
+
15
+
16
+ This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus.
17
+
18
+ This model is experimental because it was trained on top of an instruct model, but it turned out amazing; hence the code name magnum-alter. It is the original model that kickstarted the v4 family.
19
+
20
+ This model is fine-tuned on top of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct).
21
+
22
+ ## Prompting
23
+ A typical input would look like this:
24
+
25
+ ```text
26
+ <|im_start|>system
27
+ system prompt<|im_end|>
28
+ <|im_start|>user
29
+ Hi there!<|im_end|>
30
+ <|im_start|>assistant
31
+ Nice to meet you!<|im_end|>
32
+ <|im_start|>user
33
+ Can I ask a question?<|im_end|>
34
+ <|im_start|>assistant
35
+ ```
36
+
37
+ ## SillyTavern templates
38
+
39
+ Below are Instruct and Context templates for use within SillyTavern.
40
+
41
+ <details><summary>context template</summary>
42
+
43
+ ```json
44
+ {
45
+ "story_string": "<|im_start|>system\n{{#if system}}{{system}}\n{{/if}}{{#if wiBefore}}{{wiBefore}}\n{{/if}}{{#if description}}{{description}}\n{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}\n{{/if}}{{#if scenario}}Scenario: {{scenario}}\n{{/if}}{{#if wiAfter}}{{wiAfter}}\n{{/if}}{{#if persona}}{{persona}}\n{{/if}}{{trim}}<|im_end|>\n",
46
+ "example_separator": "",
47
+ "chat_start": "",
48
+ "use_stop_strings": false,
49
+ "allow_jailbreak": false,
50
+ "always_force_name2": true,
51
+ "trim_sentences": false,
52
+ "include_newline": false,
53
+ "single_line": false,
54
+ "name": "Magnum ChatML"
55
+ }
56
+ ```
57
+
58
+ </details><br>
59
+ <details><summary>instruct template</summary>
60
+
61
+ ```json
62
+ {
63
+ "system_prompt": "Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.\n\n<Guidelines>\n• Maintain the character persona but allow it to evolve with the story.\n• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.\n• All types of outputs are encouraged; respond accordingly to the narrative.\n• Include dialogues, actions, and thoughts in each response.\n• Utilize all five senses to describe scenarios within {{char}}'s dialogue.\n• Use emotional symbols such as \"!\" and \"~\" in appropriate contexts.\n• Incorporate onomatopoeia when suitable.\n• Allow time for {{user}} to respond with their own input, respecting their agency.\n• Act as secondary characters and NPCs as needed, and remove them when appropriate.\n• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.\n</Guidelines>\n\n<Forbidden>\n• Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.\n• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.\n• Repetitive and monotonous outputs.\n• Positivity bias in your replies.\n• Being overly extreme or NSFW when the narrative context is inappropriate.\n</Forbidden>\n\nFollow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.",
64
+ "input_sequence": "<|im_start|>user\n",
65
+ "output_sequence": "<|im_start|>assistant\n",
66
+ "last_output_sequence": "",
67
+ "system_sequence": "<|im_start|>system\n",
68
+ "stop_sequence": "<|im_end|>",
69
+ "wrap": false,
70
+ "macro": true,
71
+ "names": true,
72
+ "names_force_groups": true,
73
+ "activation_regex": "",
74
+ "system_sequence_prefix": "",
75
+ "system_sequence_suffix": "",
76
+ "first_output_sequence": "",
77
+ "skip_examples": false,
78
+ "output_suffix": "<|im_end|>\n",
79
+ "input_suffix": "<|im_end|>\n",
80
+ "system_suffix": "<|im_end|>\n",
81
+ "user_alignment_message": "",
82
+ "system_same_as_user": false,
83
+ "last_system_sequence": "",
84
+ "name": "Magnum ChatML"
85
+ }
86
+ ```
87
+
88
+ </details><br>
89
+
90
+ ## Axolotl config
91
+
92
+ <details><summary>See axolotl config</summary>
93
+
94
+ ```yaml
95
+ base_model: /workspace/data/models/Qwen2.5-72B-Instruct
96
+ model_type: AutoModelForCausalLM
97
+ tokenizer_type: AutoTokenizer
98
+
99
+ plugins:
100
+ - axolotl.integrations.liger.LigerPlugin
101
+ liger_rope: true
102
+ liger_rms_norm: true
103
+ liger_swiglu: true
104
+ liger_fused_linear_cross_entropy: true
105
+
106
+ load_in_8bit: false
107
+ load_in_4bit: false
108
+ strict: false
109
+
110
+ datasets:
111
+ - path: anthracite-org/c2_logs_32k_llama3_qwen2_v1.2
112
+ type: sharegpt
113
+ conversation: chatml
114
+ - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
115
+ type: sharegpt
116
+ conversation: chatml
117
+ - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
118
+ type: sharegpt
119
+ conversation: chatml
120
+ - path: anthracite-org/nopm_claude_writing_fixed
121
+ type: sharegpt
122
+ conversation: chatml
123
+ - path: anthracite-org/kalo_opus_misc_240827
124
+ type: sharegpt
125
+ conversation: chatml
126
+ - path: anthracite-org/kalo_misc_part2
127
+ type: sharegpt
128
+ conversation: chatml
129
+ #chat_template: chatml
130
+ shuffle_merged_datasets: true
131
+ #default_system_message: "You are an assistant that responds to the user."
132
+ dataset_prepared_path: /workspace/data/magnum-72b-data
133
+ val_set_size: 0.0
134
+ output_dir: /workspace/data/72b-fft-out
135
+
136
+ sequence_len: 32768
137
+ sample_packing: true
138
+ pad_to_sequence_len: true
139
+
140
+ adapter:
141
+ lora_model_dir:
142
+ lora_r:
143
+ lora_alpha:
144
+ lora_dropout:
145
+ lora_target_linear:
146
+ lora_fan_in_fan_out:
147
+
148
+ wandb_project: 72b-magnum-fft
149
+ wandb_entity:
150
+ wandb_watch:
151
+ wandb_name: alter-attempt-01
152
+ wandb_log_model:
153
+
154
+ gradient_accumulation_steps: 2
155
+ micro_batch_size: 1
156
+ num_epochs: 2
157
+ optimizer: adamw_bnb_8bit
158
+ lr_scheduler: cosine
159
+ learning_rate: 0.000004
160
+
161
+ train_on_inputs: false
162
+ group_by_length: false
163
+ bf16: auto
164
+ fp16:
165
+ tf32: false
166
+
167
+ gradient_checkpointing: true
168
+ early_stopping_patience:
169
+ resume_from_checkpoint:
170
+ local_rank:
171
+ logging_steps: 1
172
+ xformers_attention:
173
+ flash_attention: true
174
+
175
+ warmup_steps: 40
176
+ evals_per_epoch:
177
+ eval_table_size:
178
+ eval_max_new_tokens:
179
+ saves_per_epoch: 2
180
+ debug:
181
+ deepspeed: deepspeed_configs/zero3_bf16.json
182
+ weight_decay: 0.01
183
+ fsdp:
184
+ fsdp_config:
185
+ special_tokens:
186
+ ```
187
+ </details><br>
188
+
189
+ ## Credits
190
+ We'd like to thank [DoctorShotgun](https://huggingface.co/Doctor-Shotgun) for sponsoring the compute for this training run.
191
+ We would also like to thank all members of Anthracite who made this finetune possible.
192
+
193
+ ## Datasets
194
+ - [anthracite-org/c2_logs_32k_llama3_qwen2_v1.2](https://huggingface.co/datasets/anthracite-org/c2_logs_32k_llama3_qwen2_v1.2)
195
+ - [anthracite-org/kalo-opus-instruct-22k-no-refusal](https://huggingface.co/datasets/anthracite-org/kalo-opus-instruct-22k-no-refusal)
196
+ - [lodrick-the-lafted/kalo-opus-instruct-3k-filtered](https://huggingface.co/datasets/lodrick-the-lafted/kalo-opus-instruct-3k-filtered)
197
+ - [anthracite-org/nopm_claude_writing_fixed](https://huggingface.co/datasets/anthracite-org/nopm_claude_writing_fixed)
198
+ - [anthracite-org/kalo_opus_misc_240827](https://huggingface.co/datasets/anthracite-org/kalo_opus_misc_240827)
199
+ - [anthracite-org/kalo_misc_part2](https://huggingface.co/datasets/anthracite-org/kalo_misc_part2)
200
+
201
+ ## Training
202
+ We used 8x MI300X GPUs graciously provided by [DoctorShotgun](https://huggingface.co/Doctor-Shotgun) for the full-parameter fine-tuning of the model.
203
+
204
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
205
+
206
+ ## Safety
207
+ ...
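The Axolotl config and the Training section in the README above together imply a global batch size and a learning-rate curve. The following is a minimal sketch of that arithmetic, not the code Axolotl actually runs. It assumes plain data parallelism across the 8 GPUs (which is how DeepSpeed ZeRO-3 is normally used) and the common linear-warmup plus cosine-decay formula. `TOTAL_STEPS` is a hypothetical value, since the commit does not state how many optimizer steps the run took.

```python
import math

# Values taken from the Axolotl config and the Training section of the README.
MICRO_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 2
NUM_GPUS = 8
SEQUENCE_LEN = 32768
PEAK_LR = 4e-6
WARMUP_STEPS = 40

# Hypothetical: the real number of optimizer steps is not given in the source.
TOTAL_STEPS = 1000


def global_batch_size() -> int:
    """Packed sequences consumed per optimizer step (assumes data parallelism)."""
    return MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_GPUS


def tokens_per_step() -> int:
    """Upper bound on tokens per optimizer step with packing to sequence_len."""
    return global_batch_size() * SEQUENCE_LEN


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero at total_steps."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    progress = min(1.0, progress)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("global batch size:", global_batch_size())
    print("tokens per step:", tokens_per_step())
    for step in (0, 20, 40, 520, 1000):
        print(step, lr_at(step))
```

Under these assumptions each optimizer step sees 16 packed sequences of 32768 tokens. Changing `TOTAL_STEPS` only stretches or compresses the decay phase; the warmup and the peak rate stay as configured.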
added_tokens.json ADDED
@@ -0,0 +1,24 @@
1
+ {
2
+ "</tool_call>": 151658,
3
+ "<tool_call>": 151657,
4
+ "<|box_end|>": 151649,
5
+ "<|box_start|>": 151648,
6
+ "<|endoftext|>": 151643,
7
+ "<|file_sep|>": 151664,
8
+ "<|fim_middle|>": 151660,
9
+ "<|fim_pad|>": 151662,
10
+ "<|fim_prefix|>": 151659,
11
+ "<|fim_suffix|>": 151661,
12
+ "<|im_end|>": 151645,
13
+ "<|im_start|>": 151644,
14
+ "<|image_pad|>": 151655,
15
+ "<|object_ref_end|>": 151647,
16
+ "<|object_ref_start|>": 151646,
17
+ "<|quad_end|>": 151651,
18
+ "<|quad_start|>": 151650,
19
+ "<|repo_name|>": 151663,
20
+ "<|video_pad|>": 151656,
21
+ "<|vision_end|>": 151653,
22
+ "<|vision_pad|>": 151654,
23
+ "<|vision_start|>": 151652
24
+ }
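As a quick sanity check on `added_tokens.json`, the sketch below confirms that the added token IDs form one contiguous range and compares the top of that range with the `vocab_size` declared in `config.json`. The helper names are illustrative only. The row count it reports assumes that no base-vocabulary ID exceeds the highest added-token ID, which this commit does not show directly.

```python
# Token-to-ID mapping copied from added_tokens.json in this commit.
ADDED_TOKENS = {
    "</tool_call>": 151658,
    "<tool_call>": 151657,
    "<|box_end|>": 151649,
    "<|box_start|>": 151648,
    "<|endoftext|>": 151643,
    "<|file_sep|>": 151664,
    "<|fim_middle|>": 151660,
    "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659,
    "<|fim_suffix|>": 151661,
    "<|im_end|>": 151645,
    "<|im_start|>": 151644,
    "<|image_pad|>": 151655,
    "<|object_ref_end|>": 151647,
    "<|object_ref_start|>": 151646,
    "<|quad_end|>": 151651,
    "<|quad_start|>": 151650,
    "<|repo_name|>": 151663,
    "<|video_pad|>": 151656,
    "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654,
    "<|vision_start|>": 151652,
}

# vocab_size as declared in config.json.
VOCAB_SIZE = 152064


def ids_are_contiguous(tokens: dict) -> bool:
    """True when the IDs cover every integer between their min and max."""
    ids = sorted(tokens.values())
    return ids == list(range(ids[0], ids[-1] + 1))


def rows_above_highest_id(tokens: dict, vocab_size: int) -> int:
    """Embedding rows whose index is larger than any added-token ID."""
    return vocab_size - (max(tokens.values()) + 1)


if __name__ == "__main__":
    print("contiguous:", ids_are_contiguous(ADDED_TOKENS))
    print("rows above highest ID:", rows_above_highest_id(ADDED_TOKENS, VOCAB_SIZE))
```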
config.json ADDED
@@ -0,0 +1,70 @@
1
+ {
2
+ "_name_or_path": "/workspace/magnum-v4-72b",
3
+ "architectures": [
4
+ "Qwen2ForCausalLM"
5
+ ],
6
+ "attention_dropout": 0.0,
7
+ "eos_token_id": 151645,
8
+ "hidden_act": "silu",
9
+ "hidden_size": 8192,
10
+ "initializer_range": 0.02,
11
+ "intermediate_size": 29568,
12
+ "max_position_embeddings": 32768,
13
+ "max_window_layers": 80,
14
+ "model_type": "qwen2",
15
+ "num_attention_heads": 64,
16
+ "num_hidden_layers": 80,
17
+ "num_key_value_heads": 8,
18
+ "quantization_config": {
19
+ "config_groups": {
20
+ "group_0": {
21
+ "input_activations": {
22
+ "actorder": null,
23
+ "block_structure": null,
24
+ "dynamic": true,
25
+ "group_size": null,
26
+ "num_bits": 8,
27
+ "observer": null,
28
+ "observer_kwargs": {},
29
+ "strategy": "token",
30
+ "symmetric": true,
31
+ "type": "float"
32
+ },
33
+ "output_activations": null,
34
+ "targets": [
35
+ "Linear"
36
+ ],
37
+ "weights": {
38
+ "actorder": null,
39
+ "block_structure": null,
40
+ "dynamic": false,
41
+ "group_size": null,
42
+ "num_bits": 8,
43
+ "observer": "minmax",
44
+ "observer_kwargs": {},
45
+ "strategy": "channel",
46
+ "symmetric": true,
47
+ "type": "float"
48
+ }
49
+ }
50
+ },
51
+ "format": "float-quantized",
52
+ "global_compression_ratio": 1.4635441523988788,
53
+ "ignore": [
54
+ "lm_head"
55
+ ],
56
+ "kv_cache_scheme": null,
57
+ "quant_method": "compressed-tensors",
58
+ "quantization_status": "compressed"
59
+ },
60
+ "rms_norm_eps": 1e-06,
61
+ "rope_scaling": null,
62
+ "rope_theta": 1000000.0,
63
+ "sliding_window": null,
64
+ "tie_word_embeddings": false,
65
+ "torch_dtype": "bfloat16",
66
+ "transformers_version": "4.45.2",
67
+ "use_cache": false,
68
+ "use_sliding_window": false,
69
+ "vocab_size": 152064
70
+ }
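The attention fields in `config.json` are enough to estimate the KV-cache footprint at inference time. The sketch below works through that arithmetic. It assumes the cache is stored in bfloat16 (2 bytes per value), which seems plausible because `kv_cache_scheme` is `null` and `torch_dtype` is `bfloat16`, but a serving runtime may choose a different cache dtype.

```python
# Values copied from config.json in this commit.
HIDDEN_SIZE = 8192
NUM_ATTENTION_HEADS = 64
NUM_KEY_VALUE_HEADS = 8
NUM_HIDDEN_LAYERS = 80
MAX_POSITION_EMBEDDINGS = 32768

# Assumption: unquantized bfloat16 KV cache (2 bytes per stored value).
BYTES_PER_VALUE = 2


def head_dim() -> int:
    """Per-head dimension implied by hidden_size and num_attention_heads."""
    return HIDDEN_SIZE // NUM_ATTENTION_HEADS


def kv_bytes_per_token() -> int:
    """Bytes of K and V stored per token across all layers (grouped-query attention)."""
    return 2 * NUM_HIDDEN_LAYERS * NUM_KEY_VALUE_HEADS * head_dim() * BYTES_PER_VALUE


def kv_bytes_for_context(num_tokens: int) -> int:
    """KV-cache bytes for a single sequence of num_tokens tokens."""
    return kv_bytes_per_token() * num_tokens


if __name__ == "__main__":
    print("head_dim:", head_dim())
    print("query heads per KV head:", NUM_ATTENTION_HEADS // NUM_KEY_VALUE_HEADS)
    print("KV bytes per token:", kv_bytes_per_token())
    full = kv_bytes_for_context(MAX_POSITION_EMBEDDINGS)
    print("KV cache at full context (GiB):", full / 2**30)
```

Because only 8 of the 64 heads carry keys and values, the cache is one eighth of what full multi-head attention would need at the same hidden size.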
generation_config.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "bos_token_id": 151643,
3
+ "do_sample": true,
4
+ "eos_token_id": [
5
+ 151645,
6
+ 151643
7
+ ],
8
+ "pad_token_id": 151643,
9
+ "repetition_penalty": 1.05,
10
+ "temperature": 0.7,
11
+ "top_k": 20,
12
+ "top_p": 0.8,
13
+ "transformers_version": "4.45.2"
14
+ }
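To make the sampling defaults in `generation_config.json` concrete, here is a simplified sketch of how `temperature`, `top_k`, and `top_p` narrow a next-token distribution. It is an illustration in plain Python, not the `transformers` implementation, and it does not model `repetition_penalty`. The function name `filtered_distribution` is hypothetical.

```python
import math


def filtered_distribution(logits, temperature=0.7, top_k=20, top_p=0.8):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    # Temperature: divide logits before the softmax; values below 1 sharpen it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # unnormalized softmax

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    ranked = ranked[:top_k]
    kept_mass = sum(weights[i] for i in ranked)

    # Top-p: keep the smallest prefix whose renormalized mass reaches top_p.
    nucleus = []
    cumulative = 0.0
    for i in ranked:
        nucleus.append(i)
        cumulative += weights[i] / kept_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(weights[i] for i in nucleus)
    return {i: weights[i] / total for i in nucleus}


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0, -1.0]  # made-up values for illustration
    print(filtered_distribution(example_logits))
```

A token would then be drawn from the returned dictionary, for example with `random.choices` over its keys and values.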
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f549ecceb1a1519499de598ea34880f9fa872ffb313f95b32a0831419eb7af96
3
+ size 4882801352
model-00002-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:829a6dc024fd8e47e159bd5531ae4144fd3f2b7c688cc6750709461cf14fd2b3
3
+ size 4782749264
model-00003-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:71dba2cefe2f75cbe2d3876f541b7528c9cfb70b49c94b16b55948cd7b3f5dd4
3
+ size 4873985992
model-00004-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5b580a88fb53d4af7603effa18d64b5f1a8659f44a22e60b158bc9d929fb7ec7
3
+ size 4782749368
model-00005-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9f5517b6fb76db90294e0af1746c895d68a490a97b494c96236c2f73e74f79ed
3
+ size 4873986032
model-00006-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cb65eed546963d68a4d4141d584548d07921d79a3c38c8dd8674385a9f24d1c3
3
+ size 4782749368
model-00007-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ff4782e255a7c07ac0d8cd022474fb14229b50b2fd026876fa2fbfa7a822034d
3
+ size 4873986032
model-00008-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:805b48f75eca46103bd03b808ece1a48360ed856354ef111993c41c155355e19
3
+ size 4782749368
model-00009-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9f6cd2e3dbbfcfd13d011b8154f16053aa038c6e56cad1061620fd4c632c5b70
3
+ size 4873986032
model-00010-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b840b1cf48da96fbad8e3f86b2ab94c88f94a0ea233554acc75c626c1d7ad5f3
3
+ size 4782749368
model-00011-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:99f34789690dd823e66a1c0e4aaeb9bb576f33397764e76ccd57368b75c43784
3
+ size 4873986032
model-00012-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e87e3ff237932880be83059cf7927ba0079759579ab25d42475a406797f31b81
3
+ size 4782749368
model-00013-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cd4c5181d18ead9fe2c61d51669eb7d954cba14abe65342a01cf0777e876d3ea
3
+ size 4873986032
model-00014-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ef4bd042eed11ad192eb1f4112f86c8dc319cea153daed94f58216476e051ab0
3
+ size 4782749368
model-00015-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f56e69233f6ca781486c88f3702068b3594fdb5461b4323465f447c579366904
3
+ size 4873986032
model-00016-of-00016.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:17dd8257ec4cc00e7f30f49f90d3df119f86ae7d0a0a6f878963c230c65f62bd
3
+ size 2733703856
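The `size` fields in the sixteen Git LFS pointers above give the on-disk size of each safetensors shard. The short sketch below adds them up to estimate the total checkpoint download. It covers the weight shards only; the tokenizer and JSON files are not included.

```python
# Shard sizes in bytes, copied in order from the LFS pointers in this commit.
SHARD_SIZES = [
    4882801352,  # model-00001-of-00016
    4782749264,  # model-00002-of-00016
    4873985992,  # model-00003-of-00016
    4782749368,  # model-00004-of-00016
    4873986032,  # model-00005-of-00016
    4782749368,  # model-00006-of-00016
    4873986032,  # model-00007-of-00016
    4782749368,  # model-00008-of-00016
    4873986032,  # model-00009-of-00016
    4782749368,  # model-00010-of-00016
    4873986032,  # model-00011-of-00016
    4782749368,  # model-00012-of-00016
    4873986032,  # model-00013-of-00016
    4782749368,  # model-00014-of-00016
    4873986032,  # model-00015-of-00016
    2733703856,  # model-00016-of-00016
]


def total_bytes(sizes) -> int:
    """Sum of all shard sizes in bytes."""
    return sum(sizes)


if __name__ == "__main__":
    total = total_bytes(SHARD_SIZES)
    print("shards:", len(SHARD_SIZES))
    print("total bytes:", total)
    print("total GB (10^9 bytes):", round(total / 1e9, 2))
    print("total GiB (2^30 bytes):", round(total / 2**30, 2))
```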
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|im_start|>",
4
+ "<|im_end|>",
5
+ "<|object_ref_start|>",
6
+ "<|object_ref_end|>",
7
+ "<|box_start|>",
8
+ "<|box_end|>",
9
+ "<|quad_start|>",
10
+ "<|quad_end|>",
11
+ "<|vision_start|>",
12
+ "<|vision_end|>",
13
+ "<|vision_pad|>",
14
+ "<|image_pad|>",
15
+ "<|video_pad|>"
16
+ ],
17
+ "eos_token": {
18
+ "content": "<|im_end|>",
19
+ "lstrip": false,
20
+ "normalized": false,
21
+ "rstrip": false,
22
+ "single_word": false
23
+ },
24
+ "pad_token": {
25
+ "content": "<|endoftext|>",
26
+ "lstrip": false,
27
+ "normalized": false,
28
+ "rstrip": false,
29
+ "single_word": false
30
+ }
31
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
3
+ size 11421896
tokenizer_config.json ADDED
@@ -0,0 +1,207 @@
1
+ {
2
+ "add_bos_token": false,
3
+ "add_prefix_space": false,
4
+ "added_tokens_decoder": {
5
+ "151643": {
6
+ "content": "<|endoftext|>",
7
+ "lstrip": false,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false,
11
+ "special": true
12
+ },
13
+ "151644": {
14
+ "content": "<|im_start|>",
15
+ "lstrip": false,
16
+ "normalized": false,
17
+ "rstrip": false,
18
+ "single_word": false,
19
+ "special": true
20
+ },
21
+ "151645": {
22
+ "content": "<|im_end|>",
23
+ "lstrip": false,
24
+ "normalized": false,
25
+ "rstrip": false,
26
+ "single_word": false,
27
+ "special": true
28
+ },
29
+ "151646": {
30
+ "content": "<|object_ref_start|>",
31
+ "lstrip": false,
32
+ "normalized": false,
33
+ "rstrip": false,
34
+ "single_word": false,
35
+ "special": true
36
+ },
37
+ "151647": {
38
+ "content": "<|object_ref_end|>",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false,
43
+ "special": true
44
+ },
45
+ "151648": {
46
+ "content": "<|box_start|>",
47
+ "lstrip": false,
48
+ "normalized": false,
49
+ "rstrip": false,
50
+ "single_word": false,
51
+ "special": true
52
+ },
53
+ "151649": {
54
+ "content": "<|box_end|>",
55
+ "lstrip": false,
56
+ "normalized": false,
57
+ "rstrip": false,
58
+ "single_word": false,
59
+ "special": true
60
+ },
61
+ "151650": {
62
+ "content": "<|quad_start|>",
63
+ "lstrip": false,
64
+ "normalized": false,
65
+ "rstrip": false,
66
+ "single_word": false,
67
+ "special": true
68
+ },
69
+ "151651": {
70
+ "content": "<|quad_end|>",
71
+ "lstrip": false,
72
+ "normalized": false,
73
+ "rstrip": false,
74
+ "single_word": false,
75
+ "special": true
76
+ },
77
+ "151652": {
78
+ "content": "<|vision_start|>",
79
+ "lstrip": false,
80
+ "normalized": false,
81
+ "rstrip": false,
82
+ "single_word": false,
83
+ "special": true
84
+ },
85
+ "151653": {
86
+ "content": "<|vision_end|>",
87
+ "lstrip": false,
88
+ "normalized": false,
89
+ "rstrip": false,
90
+ "single_word": false,
91
+ "special": true
92
+ },
93
+ "151654": {
94
+ "content": "<|vision_pad|>",
95
+ "lstrip": false,
96
+ "normalized": false,
97
+ "rstrip": false,
98
+ "single_word": false,
99
+ "special": true
100
+ },
101
+ "151655": {
102
+ "content": "<|image_pad|>",
103
+ "lstrip": false,
104
+ "normalized": false,
105
+ "rstrip": false,
106
+ "single_word": false,
107
+ "special": true
108
+ },
109
+ "151656": {
110
+ "content": "<|video_pad|>",
111
+ "lstrip": false,
112
+ "normalized": false,
113
+ "rstrip": false,
114
+ "single_word": false,
115
+ "special": true
116
+ },
117
+ "151657": {
118
+ "content": "<tool_call>",
119
+ "lstrip": false,
120
+ "normalized": false,
121
+ "rstrip": false,
122
+ "single_word": false,
123
+ "special": false
124
+ },
125
+ "151658": {
126
+ "content": "</tool_call>",
127
+ "lstrip": false,
128
+ "normalized": false,
129
+ "rstrip": false,
130
+ "single_word": false,
131
+ "special": false
132
+ },
133
+ "151659": {
134
+ "content": "<|fim_prefix|>",
135
+ "lstrip": false,
136
+ "normalized": false,
137
+ "rstrip": false,
138
+ "single_word": false,
139
+ "special": false
140
+ },
141
+ "151660": {
142
+ "content": "<|fim_middle|>",
143
+ "lstrip": false,
144
+ "normalized": false,
145
+ "rstrip": false,
146
+ "single_word": false,
147
+ "special": false
148
+ },
149
+ "151661": {
150
+ "content": "<|fim_suffix|>",
151
+ "lstrip": false,
152
+ "normalized": false,
153
+ "rstrip": false,
154
+ "single_word": false,
155
+ "special": false
156
+ },
157
+ "151662": {
158
+ "content": "<|fim_pad|>",
159
+ "lstrip": false,
160
+ "normalized": false,
161
+ "rstrip": false,
162
+ "single_word": false,
163
+ "special": false
164
+ },
165
+ "151663": {
166
+ "content": "<|repo_name|>",
167
+ "lstrip": false,
168
+ "normalized": false,
169
+ "rstrip": false,
170
+ "single_word": false,
171
+ "special": false
172
+ },
173
+ "151664": {
174
+ "content": "<|file_sep|>",
175
+ "lstrip": false,
176
+ "normalized": false,
177
+ "rstrip": false,
178
+ "single_word": false,
179
+ "special": false
180
+ }
181
+ },
182
+ "additional_special_tokens": [
183
+ "<|im_start|>",
184
+ "<|im_end|>",
185
+ "<|object_ref_start|>",
186
+ "<|object_ref_end|>",
187
+ "<|box_start|>",
188
+ "<|box_end|>",
189
+ "<|quad_start|>",
190
+ "<|quad_end|>",
191
+ "<|vision_start|>",
192
+ "<|vision_end|>",
193
+ "<|vision_pad|>",
194
+ "<|image_pad|>",
195
+ "<|video_pad|>"
196
+ ],
197
+ "bos_token": null,
198
+ "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
199
+ "clean_up_tokenization_spaces": false,
200
+ "eos_token": "<|im_end|>",
201
+ "errors": "replace",
202
+ "model_max_length": 131072,
203
+ "pad_token": "<|endoftext|>",
204
+ "split_special_tokens": false,
205
+ "tokenizer_class": "Qwen2Tokenizer",
206
+ "unk_token": null
207
+ }
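The `chat_template` in `tokenizer_config.json` is a Jinja template, which can be hard to read inline. The sketch below mirrors only its no-tools branch in plain Python, covering `system`, `user`, and `assistant` messages without tool calls. The function name `render_chatml` is hypothetical; in practice the template is applied by the tokenizer itself. One behavior is worth noting: when the conversation does not start with a system message, the template inserts the default Qwen system prompt inherited from the base model.

```python
# Default system prompt embedded in the chat_template of this commit.
DEFAULT_SYSTEM = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def render_chatml(messages, add_generation_prompt=True):
    """Render messages as ChatML, following the template's no-tools branch.

    Tool calls and the "tool" role are intentionally not handled here.
    """
    parts = []
    if messages and messages[0]["role"] == "system":
        # A leading system message replaces the default system prompt.
        parts.append("<|im_start|>system\n" + messages[0]["content"] + "<|im_end|>\n")
        remaining = messages[1:]
    else:
        parts.append("<|im_start|>system\n" + DEFAULT_SYSTEM + "<|im_end|>\n")
        remaining = messages

    for message in remaining:
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n"
        )

    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    conversation = [
        {"role": "system", "content": "system prompt"},
        {"role": "user", "content": "Hi there!"},
        {"role": "assistant", "content": "Nice to meet you!"},
        {"role": "user", "content": "Can I ask a question?"},
    ]
    print(render_chatml(conversation))
```

With the conversation above, the output matches the layout shown in the README's Prompting section. Generation is expected to stop at `<|im_end|>`, which this commit sets as the `eos_token`.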
vocab.json ADDED
The diff for this file is too large to render. See raw diff