austinbv committed
Commit 7d7852a · verified · 1 Parent(s): 3c7b8db

Upload folder using huggingface_hub

This view is limited to 50 files because the commit contains too many changes.
Files changed (50)
  1. .gitattributes +1 -0
  2. README.md +189 -0
  3. chat_template.json +3 -0
  4. config.json +356 -0
  5. generation_config.json +13 -0
  6. model-00001-of-00049.safetensors +3 -0
  7. model-00002-of-00049.safetensors +3 -0
  8. model-00003-of-00049.safetensors +3 -0
  9. model-00004-of-00049.safetensors +3 -0
  10. model-00005-of-00049.safetensors +3 -0
  11. model-00006-of-00049.safetensors +3 -0
  12. model-00007-of-00049.safetensors +3 -0
  13. model-00008-of-00049.safetensors +3 -0
  14. model-00009-of-00049.safetensors +3 -0
  15. model-00010-of-00049.safetensors +3 -0
  16. model-00011-of-00049.safetensors +3 -0
  17. model-00012-of-00049.safetensors +3 -0
  18. model-00013-of-00049.safetensors +3 -0
  19. model-00014-of-00049.safetensors +3 -0
  20. model-00015-of-00049.safetensors +3 -0
  21. model-00016-of-00049.safetensors +3 -0
  22. model-00017-of-00049.safetensors +3 -0
  23. model-00018-of-00049.safetensors +3 -0
  24. model-00019-of-00049.safetensors +3 -0
  25. model-00020-of-00049.safetensors +3 -0
  26. model-00021-of-00049.safetensors +3 -0
  27. model-00022-of-00049.safetensors +3 -0
  28. model-00023-of-00049.safetensors +3 -0
  29. model-00024-of-00049.safetensors +3 -0
  30. model-00025-of-00049.safetensors +3 -0
  31. model-00026-of-00049.safetensors +3 -0
  32. model-00027-of-00049.safetensors +3 -0
  33. model-00028-of-00049.safetensors +3 -0
  34. model-00029-of-00049.safetensors +3 -0
  35. model-00030-of-00049.safetensors +3 -0
  36. model-00031-of-00049.safetensors +3 -0
  37. model-00032-of-00049.safetensors +3 -0
  38. model-00033-of-00049.safetensors +3 -0
  39. model-00034-of-00049.safetensors +3 -0
  40. model-00035-of-00049.safetensors +3 -0
  41. model-00036-of-00049.safetensors +3 -0
  42. model-00037-of-00049.safetensors +3 -0
  43. model-00038-of-00049.safetensors +3 -0
  44. model-00039-of-00049.safetensors +3 -0
  45. model-00040-of-00049.safetensors +3 -0
  46. model-00041-of-00049.safetensors +3 -0
  47. model-00042-of-00049.safetensors +3 -0
  48. model-00043-of-00049.safetensors +3 -0
  49. model-00044-of-00049.safetensors +3 -0
  50. model-00045-of-00049.safetensors +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,189 @@
+ ---
+ library_name: transformers
+ language:
+ - ar
+ - de
+ - en
+ - es
+ - fr
+ - hi
+ - id
+ - it
+ - pt
+ - th
+ - tl
+ - vi
+ base_model:
+ - meta-llama/Llama-4-Scout-17B-16E
+ tags:
+ - facebook
+ - meta
+ - pytorch
+ - llama
+ - llama-4
+ - mlx
+ extra_gated_prompt: '**LLAMA 4 COMMUNITY LICENSE AGREEMENT**
+
+ Llama 4 Version Effective Date: April 5, 2025
+
+ "**Agreement**" means the terms and conditions for use, reproduction, distribution
+ and modification of the Llama Materials set forth herein.
+
+ "**Documentation**" means the specifications, manuals and documentation accompanying
+ Llama 4 distributed by Meta at [https://www.llama.com/docs/overview](https://llama.com/docs/overview).
+
+ "**Licensee**" or "**you**" means you, or your employer or any other person or entity
+ (if you are entering into this Agreement on such person or entity’s behalf), of
+ the age required under applicable laws, rules or regulations to provide legal consent
+ and that has legal authority to bind your employer or such other person or entity
+ if you are entering in this Agreement on their behalf.
+
+ "**Llama 4**" means the foundational large language models and software and algorithms,
+ including machine-learning model code, trained model weights, inference-enabling
+ code, training-enabling code, fine-tuning enabling code and other elements of the
+ foregoing distributed by Meta at [https://www.llama.com/llama-downloads](https://www.llama.com/llama-downloads).
+
+ "**Llama Materials**" means, collectively, Meta’s proprietary Llama 4 and Documentation
+ (and any portion thereof) made available under this Agreement.
+
+ "**Meta**" or "**we**" means Meta Platforms Ireland Limited (if you are located
+ in or, if you are an entity, your principal place of business is in the EEA or Switzerland)
+ and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland).
+
+ By clicking "I Accept" below or by using or distributing any portion or element
+ of the Llama Materials, you agree to be bound by this Agreement.
+
+ 1\. **License Rights and Redistribution**.
+
+ a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable
+ and royalty-free limited license under Meta’s intellectual property or other rights
+ owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy,
+ create derivative works of, and make modifications to the Llama Materials.
+
+ b. Redistribution and Use.
+
+ i. If you distribute or make available the Llama Materials (or any derivative works
+ thereof), or a product or service (including another AI model) that contains any
+ of them, you shall (A) provide a copy of this Agreement with any such Llama Materials;
+ and (B) prominently display "Built with Llama" on a related website, user interface,
+ blogpost, about page, or product documentation. If you use the Llama Materials or
+ any outputs or results of the Llama Materials to create, train, fine tune, or otherwise
+ improve an AI model, which is distributed or made available, you shall also include
+ "Llama" at the beginning of any such AI model name.
+
+ ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee
+ as part of an integrated end user product, then Section 2 of this Agreement will
+ not apply to you.
+
+ iii. You must retain in all copies of the Llama Materials that you distribute the
+ following attribution notice within a "Notice" text file distributed as a part of
+ such copies: "Llama 4 is licensed under the Llama 4 Community License, Copyright
+ © Meta Platforms, Inc. All Rights Reserved."
+
+ iv. Your use of the Llama Materials must comply with applicable laws and regulations
+ (including trade compliance laws and regulations) and adhere to the Acceptable Use
+ Policy for the Llama Materials (available at [https://www.llama.com/llama4/use-policy](https://www.llama.com/llama4/use-policy)),
+ which is hereby incorporated by reference into this Agreement.
+
+ 2\. **Additional Commercial Terms**. If, on the Llama 4 version release date, the monthly active
+ users of the products or services made available by or for Licensee, or Licensee’s
+ affiliates, is greater than 700 million monthly active users in the preceding calendar
+ month, you must request a license from Meta, which Meta may grant to you in its
+ sole discretion, and you are not authorized to exercise any of the rights under
+ this Agreement unless or until Meta otherwise expressly grants you such rights.
+
+ 3\. **Disclaimer of Warranty**. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS
+ AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES
+ OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED,
+ INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY,
+ OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING
+ THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY
+ RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
+
+ 4\. **Limitation of Liability**. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE
+ UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY,
+ OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT,
+ SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META
+ OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
+
+ 5\. **Intellectual Property**.
+
+ a. No trademark licenses are granted under this Agreement, and in connection with
+ the Llama Materials, neither Meta nor Licensee may use any name or mark owned by
+ or associated with the other or any of its affiliates, except as required for reasonable
+ and customary use in describing and redistributing the Llama Materials or as set
+ forth in this Section 5(a). Meta hereby grants you a license to use "Llama" (the
+ "Mark") solely as required to comply with the last sentence of Section 1.b.i. You
+ will comply with Meta’s brand guidelines (currently accessible at [https://about.meta.com/brand/resources/meta/company-brand/](https://about.meta.com/brand/resources/meta/company-brand/)).
+ All goodwill arising out of your use of the Mark will inure to the benefit of Meta.
+
+ b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for
+ Meta, with respect to any derivative works and modifications of the Llama Materials
+ that are made by you, as between you and Meta, you are and will be the owner of
+ such derivative works and modifications.
+
+ c. If you institute litigation or other proceedings against Meta or any entity (including
+ a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or
+ Llama 4 outputs or results, or any portion of any of the foregoing, constitutes
+ infringement of intellectual property or other rights owned or licensable by you,
+ then any licenses granted to you under this Agreement shall terminate as of the
+ date such litigation or claim is filed or instituted. You will indemnify and hold
+ harmless Meta from and against any claim by any third party arising out of or related
+ to your use or distribution of the Llama Materials.
+
+ 6\. **Term and Termination**. The term of this Agreement will commence upon your
+ acceptance of this Agreement or access to the Llama Materials and will continue
+ in full force and effect until terminated in accordance with the terms and conditions
+ herein. Meta may terminate this Agreement if you are in breach of any term or condition
+ of this Agreement. Upon termination of this Agreement, you shall delete and cease
+ use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of
+ this Agreement.
+
+ 7\. **Governing Law and Jurisdiction**. This Agreement will be governed and construed
+ under the laws of the State of California without regard to choice of law principles,
+ and the UN Convention on Contracts for the International Sale of Goods does not
+ apply to this Agreement. The courts of California shall have exclusive jurisdiction
+ of any dispute arising out of this Agreement.'
+ extra_gated_fields:
+ First Name: text
+ Last Name: text
+ Date of birth: date_picker
+ Country: country
+ Affiliation: text
+ Job title:
+ type: select
+ options:
+ - Student
+ - Research Graduate
+ - AI researcher
+ - AI developer/engineer
+ - Reporter
+ - Other
+ geo: ip_location
+ ? By clicking Submit below I accept the terms of the license and acknowledge that
+ the information I provide will be collected stored processed and shared in accordance
+ with the Meta Privacy Policy
+ : checkbox
+ extra_gated_description: The information you provide will be collected, stored, processed
+ and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
+ extra_gated_button_content: Submit
+ extra_gated_heading: Please be sure to provide your full legal name, date of birth,
+ and full organization name with all corporate identifiers. Avoid the use of acronyms
+ and special characters. Failure to follow these instructions may prevent you from
+ accessing this model and others on Hugging Face. You will not have the ability to
+ edit this form after submission, so please ensure all information is accurate.
+ license: other
+ license_name: llama4
+ ---
+
+ # mlx-community/meta-llama-Llama-4-Scout-17B-16E-Instruct-bf16
+ This model was converted to MLX format from [`meta-llama/Llama-4-Scout-17B-16E-Instruct`](https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct) using mlx-vlm version **0.1.21**.
+ Refer to the [original model card](https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct) for more details on the model.
+ ## Use with mlx
+
+ ```bash
+ pip install -U mlx-vlm
+ ```
+
+ ```bash
+ python -m mlx_vlm.generate --model mlx-community/meta-llama-Llama-4-Scout-17B-16E-Instruct-bf16 --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
+ ```
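The CLI call above can also be scripted from Python. A minimal sketch — the `load`/`generate` call shapes are assumptions based on the mlx-vlm 0.1.x Python API and may differ between releases; actually running `describe` downloads the full bf16 checkpoint (49 safetensors shards):

```python
# Sketch of driving mlx-vlm from Python instead of the CLI.
# NOTE: the load/generate signatures below are assumptions based on the
# mlx-vlm 0.1.x API and may change between releases.

MODEL = "mlx-community/meta-llama-Llama-4-Scout-17B-16E-Instruct-bf16"


def build_cli_args(image_path: str, prompt: str = "Describe this image.",
                   max_tokens: int = 100, temperature: float = 0.0) -> list:
    """Argument list equivalent to the `python -m mlx_vlm.generate` call above."""
    return [
        "--model", MODEL,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
        "--image", image_path,
    ]


def describe(image_path: str) -> str:
    """Run one generation. Imports lazily so build_cli_args works without mlx installed."""
    from mlx_vlm import load, generate  # assumed mlx-vlm 0.1.x entry points
    model, processor = load(MODEL)
    return generate(model, processor, "Describe this image.",
                    image=[image_path], max_tokens=100, temperature=0.0)
```

Temperature 0.0 makes the run deterministic, matching the CLI example; raise it (and pass `top_p`) for sampled output.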
chat_template.json ADDED
@@ -0,0 +1,3 @@
+ {
+ "chat_template": "{{- bos_token }}\n{%- if custom_tools is defined %}\n {%- set tools = custom_tools %}\n{%- endif %}\n{%- if not tools_in_user_message is defined %}\n {%- set tools_in_user_message = true %}\n{%- endif %}\n{%- if not date_string is defined %}\n {%- if strftime_now is defined %}\n {%- set date_string = strftime_now(\"%d %b %Y\") %}\n {%- else %}\n {%- set date_string = \"26 Jul 2024\" %}\n {%- endif %}\n{%- endif %}\n{%- if not tools is defined %}\n {%- set tools = none %}\n{%- endif %}\n\n{#- This block extracts the system message, so we can slot it into the right place. #}\n{%- if messages[0]['role'] == 'system' %} \n {%- if messages[0]['content'] is string %}\n {%- set system_message = messages[0]['content']|trim %}\n {%- else %}\n {#- FIXME: The processor requires an array, always. #}\n {%- set system_message = messages[0]['content'][0]['text']|trim %}\n {%- endif %}\n {%- set messages = messages[1:] %}\n {%- set user_supplied_system_message = true %}\n{%- else %}\n {%- set system_message = \"\" %}\n {%- set user_supplied_system_message = false %}\n{%- endif %}\n\n{#- System message if the user supplied one #}\n{%- if user_supplied_system_message %}\n {{- \"<|header_start|>system<|header_end|>\\n\\n\" }}\n {%- if tools is not none %}\n {{- \"Environment: ipython\\n\" }}\n {%- endif %}\n {%- if tools is not none and not tools_in_user_message %}\n {{- \"You have access to the following functions. To call a function, please respond with JSON for a function call.\" }}\n {{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' 
}}\n {{- \"Do not use variables.\\n\\n\" }}\n {%- for t in tools %}\n {{- t | tojson(indent=4) }}\n {{- \"\\n\\n\" }}\n {%- endfor %}\n {%- endif %}\n {{- system_message }}\n {{- \"<|eot|>\" }}\n{%- endif %}\n\n{#- Custom tools are passed in a user message with some extra guidance #}\n{%- if tools_in_user_message and not tools is none %}\n {#- Extract the first user message so we can plug it in here #}\n {%- if messages | length != 0 %}\n {%- set first_user_message = messages[0]['content']|trim %}\n {%- set messages = messages[1:] %}\n {%- else %}\n {{- raise_exception(\"Cannot put tools in the first user message when there's no first user message!\") }}\n{%- endif %}\n {{- '<|header_start|>user<|header_end|>\\n\\n' -}}\n {{- \"Given the following functions, please respond with a JSON for a function call \" }}\n {{- \"with its proper arguments that best answers the given prompt.\\n\\n\" }}\n {{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' 
}}\n {{- \"Do not use variables.\\n\\n\" }}\n {%- for t in tools %}\n {{- t | tojson(indent=4) }}\n {{- \"\\n\\n\" }}\n {%- endfor %}\n {{- first_user_message + \"<|eot|>\"}}\n{%- endif %}\n\n{%- for message in messages %}\n {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}\n {{- '<|header_start|>' + message['role'] + '<|header_end|>\\n\\n' }}\n {%- if message['content'] is string %}\n {{- message['content'] }}\n {%- else %}\n {%- for content in message['content'] %}\n {%- if content['type'] == 'image' %}\n {{- '<|image|>' }}\n {%- elif content['type'] == 'text' %}\n {{- content['text'] }}\n {%- endif %}\n {%- endfor %}\n {%- endif %}\n {{- \"<|eot|>\" }}\n {%- elif 'tool_calls' in message and message.tool_calls|length > 0 %}\n {{- '<|header_start|>assistant<|header_end|>\\n\\n' -}}\n {{- '<|python_start|>' }}\n {%- if message['content'] is string %}\n {{- message['content'] }}\n {%- else %}\n {%- for content in message['content'] %}\n {%- if content['type'] == 'image' %}\n {{- '<|image|>' }}\n {%- elif content['type'] == 'text' %}\n {{- content['text'] }}\n {%- endif %}\n {%- endfor %}\n {%- endif %}\n {{- '<|python_end|>' }}\n {%- for tool_call in message.tool_calls %}\n {{- '{\"name\": \"' + tool_call.function.name + '\", ' }}\n {{- '\"parameters\": ' }}\n {{- tool_call.function.arguments | tojson }}\n {{- \"}\" }}\n {%- endfor %}\n {{- \"<|eot|>\" }}\n {%- elif message.role == \"tool\" or message.role == \"ipython\" %}\n {{- \"<|header_start|>ipython<|header_end|>\\n\\n\" }}\n {%- if message.content is mapping or message.content is iterable %}\n {{- message.content | tojson }}\n {%- else %}\n {{- message.content }}\n {%- endif %}\n {{- \"<|eot|>\" }}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|header_start|>assistant<|header_end|>\\n\\n' }}\n{%- endif %}\n"
+ }
config.json ADDED
@@ -0,0 +1,356 @@
+ {
+ "_attn_implementation_autoset": false,
+ "add_cross_attention": false,
+ "architectures": [
+ "Llama4ForConditionalGeneration"
+ ],
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "boi_token_index": 200080,
+ "bos_token_id": null,
+ "chunk_size_feed_forward": 0,
+ "cross_attention_hidden_size": null,
+ "decoder_start_token_id": null,
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "early_stopping": false,
+ "encoder_no_repeat_ngram_size": 0,
+ "eoi_token_index": 200081,
+ "eos_token_id": null,
+ "exponential_decay_length_penalty": null,
+ "finetuning_task": null,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": null,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1"
+ },
+ "image_token_index": 200092,
+ "is_decoder": false,
+ "is_encoder_decoder": false,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1
+ },
+ "length_penalty": 1.0,
+ "max_length": 20,
+ "min_length": 0,
+ "model_type": "llama4",
+ "no_repeat_ngram_size": 0,
+ "num_beam_groups": 1,
+ "num_beams": 1,
+ "num_return_sequences": 1,
+ "output_attentions": false,
+ "output_hidden_states": false,
+ "output_scores": false,
+ "pad_token_id": null,
+ "prefix": null,
+ "problem_type": null,
+ "pruned_heads": {},
+ "remove_invalid_values": false,
+ "repetition_penalty": 1.0,
+ "return_dict": true,
+ "return_dict_in_generate": false,
+ "sep_token_id": null,
+ "suppress_tokens": null,
+ "task_specific_params": null,
+ "temperature": 1.0,
+ "text_config": {
+ "return_dict": true,
+ "output_hidden_states": false,
+ "output_attentions": false,
+ "torchscript": false,
+ "torch_dtype": "bfloat16",
+ "use_bfloat16": false,
+ "tf_legacy_loss": false,
+ "pruned_heads": {},
+ "tie_word_embeddings": false,
+ "chunk_size_feed_forward": 0,
+ "is_encoder_decoder": false,
+ "is_decoder": false,
+ "cross_attention_hidden_size": null,
+ "add_cross_attention": false,
+ "tie_encoder_decoder": false,
+ "max_length": 20,
+ "min_length": 0,
+ "do_sample": false,
+ "early_stopping": false,
+ "num_beams": 1,
+ "num_beam_groups": 1,
+ "diversity_penalty": 0.0,
+ "temperature": 1.0,
+ "top_k": 50,
+ "top_p": 1.0,
+ "typical_p": 1.0,
+ "repetition_penalty": 1.0,
+ "length_penalty": 1.0,
+ "no_repeat_ngram_size": 0,
+ "encoder_no_repeat_ngram_size": 0,
+ "bad_words_ids": null,
+ "num_return_sequences": 1,
+ "output_scores": false,
+ "return_dict_in_generate": false,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": null,
+ "remove_invalid_values": false,
+ "exponential_decay_length_penalty": null,
+ "suppress_tokens": null,
+ "begin_suppress_tokens": null,
+ "architectures": null,
+ "finetuning_task": null,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1"
+ },
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1
+ },
+ "tokenizer_class": null,
+ "prefix": null,
+ "bos_token_id": 200000,
+ "pad_token_id": 200018,
+ "eos_token_id": [
+ 200001,
+ 200007,
+ 200008
+ ],
+ "sep_token_id": null,
+ "decoder_start_token_id": null,
+ "task_specific_params": null,
+ "problem_type": null,
+ "_name_or_path": "",
+ "_attn_implementation_autoset": true,
+ "attention_bias": false,
+ "for_llm_compressor": false,
+ "model_type": "llama4_text",
+ "attn_temperature_tuning": 4,
+ "attn_scale": 0.1,
+ "floor_scale": 8192,
+ "vocab_size": 202048,
+ "max_position_embeddings": 10485760,
+ "hidden_size": 5120,
+ "intermediate_size": 8192,
+ "intermediate_size_mlp": 16384,
+ "num_hidden_layers": 48,
+ "num_attention_heads": 40,
+ "rope_scaling": {
+ "factor": 8.0,
+ "high_freq_factor": 4.0,
+ "low_freq_factor": 1.0,
+ "original_max_position_embeddings": 8192,
+ "rope_type": "llama3"
+ },
+ "num_key_value_heads": 8,
+ "hidden_act": "silu",
+ "initializer_range": 0.02,
+ "rms_norm_eps": 1e-05,
+ "use_cache": true,
+ "rope_theta": 500000.0,
+ "attention_dropout": 0.0,
+ "head_dim": 128,
+ "use_qk_norm": true,
+ "num_experts_per_tok": 1,
+ "num_local_experts": 16,
+ "output_router_logits": false,
+ "router_aux_loss_coef": 0.001,
+ "router_jitter_noise": 0.0,
+ "no_rope_layers": [
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0,
+ 1,
+ 1,
+ 1,
+ 0
+ ],
+ "interleave_moe_layer_step": 1,
+ "moe_layers": [
+ 0,
+ 1,
+ 2,
+ 3,
+ 4,
+ 5,
+ 6,
+ 7,
+ 8,
+ 9,
+ 10,
+ 11,
+ 12,
+ 13,
+ 14,
+ 15,
+ 16,
+ 17,
+ 18,
+ 19,
+ 20,
+ 21,
+ 22,
+ 23,
+ 24,
+ 25,
+ 26,
+ 27,
+ 28,
+ 29,
+ 30,
+ 31,
+ 32,
+ 33,
+ 34,
+ 35,
+ 36,
+ 37,
+ 38,
+ 39,
+ 40,
+ 41,
+ 42,
+ 43,
+ 44,
+ 45,
+ 46,
+ 47
+ ],
+ "attention_chunk_size": 8192
+ },
+ "tf_legacy_loss": false,
+ "tie_encoder_decoder": false,
+ "tie_word_embeddings": false,
+ "tokenizer_class": null,
+ "top_k": 50,
+ "top_p": 1.0,
+ "torch_dtype": "bfloat16",
+ "torchscript": false,
+ "transformers_version": "4.51.0",
+ "typical_p": 1.0,
+ "use_bfloat16": false,
+ "vision_config": {
+ "hidden_size": 1408,
+ "hidden_act": "gelu",
+ "num_hidden_layers": 34,
+ "num_channels": 3,
+ "intermediate_size": 5632,
+ "image_size": 336,
+ "vision_output_dim": 4096,
+ "patch_size": 14,
+ "norm_eps": 1e-05,
+ "num_attention_heads": 16,
+ "initializer_range": 0.02,
+ "pixel_shuffle_ratio": 0.5,
+ "projector_input_dim": 4096,
+ "projector_output_dim": 4096,
+ "multi_modal_projector_bias": false,
+ "projector_dropout": 0.0,
+ "attention_dropout": 0.0,
+ "vision_feature_layer": -1,
+ "vision_feature_select_strategy": "default",
+ "rope_theta": 10000,
+ "return_dict": true,
+ "output_hidden_states": false,
+ "output_attentions": false,
+ "torchscript": false,
+ "torch_dtype": null,
+ "use_bfloat16": false,
+ "tf_legacy_loss": false,
+ "pruned_heads": {},
+ "tie_word_embeddings": true,
+ "chunk_size_feed_forward": 0,
+ "is_encoder_decoder": false,
+ "is_decoder": false,
+ "cross_attention_hidden_size": null,
+ "add_cross_attention": false,
+ "tie_encoder_decoder": false,
+ "max_length": 20,
+ "min_length": 0,
+ "do_sample": false,
+ "early_stopping": false,
+ "num_beams": 1,
+ "num_beam_groups": 1,
+ "diversity_penalty": 0.0,
+ "temperature": 1.0,
+ "top_k": 50,
+ "top_p": 1.0,
+ "typical_p": 1.0,
+ "repetition_penalty": 1.0,
+ "length_penalty": 1.0,
+ "no_repeat_ngram_size": 0,
+ "encoder_no_repeat_ngram_size": 0,
+ "bad_words_ids": null,
+ "num_return_sequences": 1,
+ "output_scores": false,
+ "return_dict_in_generate": false,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": null,
+ "remove_invalid_values": false,
+ "exponential_decay_length_penalty": null,
+ "suppress_tokens": null,
+ "begin_suppress_tokens": null,
+ "architectures": null,
+ "finetuning_task": null,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1"
+ },
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1
+ },
+ "tokenizer_class": null,
+ "prefix": null,
+ "bos_token_id": null,
+ "pad_token_id": null,
+ "eos_token_id": null,
+ "sep_token_id": null,
+ "decoder_start_token_id": null,
+ "task_specific_params": null,
+ "problem_type": null,
+ "_name_or_path": "",
+ "_attn_implementation_autoset": true,
+ "model_type": "llama4_vision_model"
+ }
+ }
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+ "bos_token_id": 200000,
+ "do_sample": true,
+ "eos_token_id": [
+ 200001,
+ 200007,
+ 200008
+ ],
+ "pad_token_id": 200018,
+ "temperature": 0.6,
+ "top_p": 0.9,
+ "transformers_version": "4.51.0.dev0"
+ }
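The defaults above enable nucleus sampling (temperature 0.6, top-p 0.9) and list three end-of-sequence ids. A minimal sketch of inspecting them without loading any weights — the inline JSON is copied from the diff above, and the key names follow the standard transformers `generation_config.json` schema:

```python
import json

# generation_config.json as added in this commit (copied from the diff above).
GENERATION_CONFIG = json.loads("""
{
  "bos_token_id": 200000,
  "do_sample": true,
  "eos_token_id": [200001, 200007, 200008],
  "pad_token_id": 200018,
  "temperature": 0.6,
  "top_p": 0.9,
  "transformers_version": "4.51.0.dev0"
}
""")


def sampling_kwargs(cfg: dict) -> dict:
    """Pick out the knobs that control nucleus sampling."""
    return {k: cfg[k] for k in ("do_sample", "temperature", "top_p")}


def is_eos(token_id: int, cfg: dict) -> bool:
    """Any of the listed ids terminates generation."""
    return token_id in cfg["eos_token_id"]
```

Note the pad token (200018) is distinct from all three EOS ids, so padding is never mistaken for end-of-sequence.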
model-00001-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:191db7fbcfcee71ed768447164a4c37d46dfd8143473181537a4ecae04eeb4a7
+ size 5280912197
model-00002-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:494191a98f07040f240bbdb0b9ac442426a178fa3a52c8f297546b7eb3e717c5
+ size 4404205351
model-00003-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a870eca768075928e2779aabe92aedfa81d44f3e63f2575e5a112f767b1f4061
+ size 4404205373
model-00004-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f8ebce21aab1ad65aaba116a3c40c41a86e0efa1eee524f549c7ad6f0f8d97a1
+ size 4404205357
model-00005-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a7389e9b123c6e79d4fa51ad68743bac9b0a1f95980406ee768d5c34b6750a39
+ size 4404205373
model-00006-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1e9200b9501f1269ece05ba64fc39bbe40a1d5b2864988659fb54b639be1333a
+ size 4404205373
model-00007-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cdb00c1f42b8295b99b698bfc16a28c54557321ff3c9974c994ea2434387ae3d
+ size 4404205373
model-00008-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:83886981cfd2c159a0b8867ca052437d429c6d5d2ad0e681b672f28356c0468c
+ size 4404205359
model-00009-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9942af831d53db7d6482f2818469e57e7832664629b9a51711784b3d76c9b8d4
+ size 4404205373
model-00010-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b83da7068af139fa285c26fc39da5857cbbdd1384000be23dc6cb3b56445d663
+ size 4404205373
model-00011-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a7ca9b84504bdae880144cf9a1ed7c9829a94114c99d3752b35a6352a92b839c
+ size 4404205358
model-00012-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bef077b73233558956ca23f5d9506cedc7d7ec97705db40efa9a4606722c4388
+ size 4404205386
model-00013-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:88613dece965dbafe2906516754141a7e666fcc418530e5f69436df0c1676fc9
+ size 4404205386
model-00014-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0a6c0bda272f2c3027e9e7cda25e4c6f0bbf87230372193c615d14190f3b11bf
+ size 4404205386
model-00015-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:becde0c15df82a086581c5c52816b35008c9e663e0bb923bc7828de1e80a7a50
+ size 4404205386
model-00016-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7c40bfae5315fb5517e481d2577c796f9b5527ff4e6f37c5741997ac9c654810
+ size 4404205380
model-00017-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b315d906dbeb113bf0eb1f1584fa4fa6d4c6bf6aa465bfb2f8a1334c84ff618c
+ size 4404205386
model-00018-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa92a6ebdd31e0453f6f2347263885b8351363a487f63deb425758b0c035b2e3
+ size 4404205386
model-00019-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:62c03211a7f7e065d44cc6691edaadacf1f81a17cbef4208bd0e67bcf7396eb4
+ size 4404205386
model-00020-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:85d62ec1a1176017fec3ae9346ade0249043ef29a5ea582ddb945df684d3f2fb
+ size 4404205386
model-00021-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e538ef41827dbd85c80938495ae3b2391742a524447e1ed8ec1761331b6000c6
+ size 4404205382
model-00022-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3545937f6081e05843a83bcae405e760f3a196a9669f50dd85e44b516eeccbe0
+ size 4404205386
model-00023-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fc12ca2eef3d546d73b7e57651b2db413404325535bdaaedffce334ac6f03328
+ size 4404205386
model-00024-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d0dd167513ac6e1cffcebb07bb4b4749906174fcfd192bf5b946d93d754da39
+ size 4404205386
model-00025-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2bb4c7f75463cfac2b57aff83a62aa4cd445ae999b630664720387a28418ce9d
+ size 4404205378
model-00026-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ff3a8e09d08aa341582cc488f8271f866328bae21c8e3b65326e197aa81793df
+ size 4404205378
model-00027-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0ec61d2cc483fba8af179540c9f2e74d2d907f1f7b29d99033c01773b4e65a1f
+ size 4404205386
model-00028-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7c9e7c630e6f230592beadd92939b81fc09dc9f2c0b4c04dc50e9e13b381dca3
+ size 4404205386
model-00029-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f82aa4740cc2958a2a96b5916bc23332e82965de3a87685d9513733a60024af8
+ size 4404205378
model-00030-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa415d1fd8d29687b1a8251daf5ed0ccd7d3e752e54f78ac63102f204f1aa442
+ size 4404205386
model-00031-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f222dee9e5057e031d811bd21cc1e05851eda9b9e05e029ed64b46c7d2063072
+ size 4404205386
model-00032-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a5f648690dc71a414d053d273afe389eac9250aefe3eb5966e33d8d67888f317
+ size 4404205386
model-00033-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5aff1948dcc6c4de251ab136527d1c9de8a294a3916c9d141682a54f3e73d2e1
+ size 4404205386
model-00034-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b98c2a895425dd759f82809a498d5eb4551afbed3115fc49bd782286bcb5532
+ size 4404205386
model-00035-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9058b630e025d0d454a71dc8547dd4101d2ef7ed98ca5747aef6f97f040c451d
+ size 4404205386
model-00036-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6fd22c61bce5bfaa29a39e52fb5bff8d7a9f9a3b99d303f10d3e0c4e458bd69e
+ size 4404205374
model-00037-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d587d16a111270f9fee6e48ffe9fb556218f34fca7bab687f1faf97f9faf5b9
+ size 4404205386
model-00038-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:90d6865bcef1d971100b5e827baeddfd2a37947f2d6271b7b3228defee3634c3
+ size 4404205386
model-00039-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ea327ba4ea389a117b2c97caa877e4805fb4d9da420fc317984a61a2474ac13
+ size 4404205386
model-00040-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:59f9691c11726db0d916194cae9a95a8ea0f0234bb6542bad232a44e29e5d6a4
+ size 4404205386
model-00041-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b93e273765cba5a96f608ddcc81a1e4473488ae52cf1fd8039839964d96671ba
+ size 4404205386
model-00042-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:695bf4ac9dffc7fa93618784147ee17256d87af0d4c042d00e2607042f48038b
+ size 4404205364
model-00043-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fb9c3ca119030a35b0da7f243d9d07ae5ce8dd6ded644c999353d944f3cb6c01
+ size 4404205374
model-00044-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:602d5db956e67da207e3791e7208d011cba78df7832e34ea446463ac667aa44b
+ size 4404205386
model-00045-of-00049.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ba416e7fae93d870bffb68790b3ba2c76279bdde0452f04bbdc2ca1805c092f5
+ size 4404205386