Upload folder using huggingface_hub
- .gitattributes +1 -0
- LICENSE +16 -0
- README.md +72 -5
- added_tokens.json +3 -0
- chat_template.jinja +47 -0
- config.json +54 -0
- generation_config.json +11 -0
- model.safetensors +3 -0
- special_tokens_map.json +33 -0
- tokenizer.json +3 -0
- tokenizer.model +3 -0
- tokenizer_config.json +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
LICENSE
CHANGED
@@ -0,0 +1,16 @@
+This model is a derivative of Google's Gemma-3-270m-it and is subject to the original Gemma Terms of Use. The dataset used for fine-tuning is licensed under Apache 2.0.
+
+---
+
+**A. Gemma Terms of Use**
+
+The use of this model is subject to the Gemma Terms of Use, which can be found here:
+https://ai.google.dev/gemma/terms
+
+**B. Apache 2.0 License (for the dataset)**
+
+The dataset `Josephgflowers/Finance-Instruct-500k` used to fine-tune this model is licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. You may obtain a copy of the License at:
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
README.md
CHANGED
@@ -1,5 +1,72 @@
----
-license: other
-
-
-
+---
+license: other
+base_model: google/gemma-3-270m-it
+tags:
+- mlx
+- finance
+- gemma
+- instruction-tuning
+datasets:
+- Josephgflowers/Finance-Instruct-500k
+---
+
+# Gemma-3-270M - Fine-tuned for Financial Instructions
+
+This is a fine-tuned version of Google's `gemma-3-270m-it` model, adapted for financial instruction-following tasks.
+
+## Model Description
+
+This model was fine-tuned using the Apple MLX framework. The goal was to specialize the base model for financial report summarization and decision-making assistance. It was trained on the `Josephgflowers/Finance-Instruct-500k` dataset.
+
+## Intended Use
+
+This model is intended for tasks in the financial domain, such as:
+* Answering questions about financial concepts.
+* Summarizing financial reports.
+* Following instructions based on financial data.
+
+## How to Use
+
+You can use this model with the `transformers` library just like any other standard Hugging Face model.
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_name = "tlgoa/tmr-ai-nano"  # <-- IMPORTANT: Replace with your HF repo name
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+
+# Note: Gemma 3 uses a specific chat template.
+# For single-turn inference, you can format the prompt like this:
+prompt = "What is the difference between revenue and profit?"
+formatted_prompt = f"### User:\n{prompt}\n\n### Assistant:"
+
+inputs = tokenizer(formatted_prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=200)
+
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+# Keep only the assistant's part of the response
+assistant_response = response.split("### Assistant:")[1].strip()
+
+print(assistant_response)
+```
+
+## Training Procedure
+
+### Dataset
+The model was fine-tuned on the `Josephgflowers/Finance-Instruct-500k` dataset. The data was preprocessed to fit the following format:
+```
+### User:
+{user_prompt}
+
+### Assistant:
+{assistant_response}
+```
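A minimal preprocessing sketch for producing this format with the `datasets` library; the `user`/`assistant` field names and the `build_example` helper are assumptions, so check the dataset card for the actual column names:

```python
from datasets import load_dataset

def build_example(record):
    # Assumed column names; adjust to the dataset's actual schema.
    return {
        "text": f"### User:\n{record['user']}\n\n### Assistant:\n{record['assistant']}"
    }

finance = load_dataset("Josephgflowers/Finance-Instruct-500k", split="train")
formatted = finance.map(build_example)
print(formatted[0]["text"][:200])
```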
+
+### Fine-tuning
+The model was fine-tuned directly (full-parameter tuning) using the Adam optimizer. Because of challenges with the LoRA implementation in the available MLX version, a full fine-tuning approach was chosen. The fine-tuned weights were originally saved in MLX's `.npz` format and subsequently converted back to the Hugging Face `safetensors` format for distribution.
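A rough sketch of the `.npz` → `safetensors` conversion step described above, assuming the MLX weights were exported as a flat NumPy archive keyed by parameter name (the file names here are illustrative, and keys/dtypes must match what `Gemma3ForCausalLM` expects):

```python
import numpy as np
from safetensors.numpy import save_file

# Assumed export from the MLX fine-tuning run: a flat dict of parameter arrays.
weights = np.load("fine_tuned_weights.npz")
tensors = {name: weights[name] for name in weights.files}

# Write a Hugging Face-style safetensors file alongside the config/tokenizer.
save_file(tensors, "model.safetensors")
```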
+
+## Licenses
+
+- **Base Model:** This model is based on Google's Gemma-3-270M, which is subject to the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
+- **Dataset:** The training data from `Josephgflowers/Finance-Instruct-500k` is available under the Apache 2.0 License.
added_tokens.json
ADDED
@@ -0,0 +1,3 @@
+{
+  "<image_soft_token>": 262144
+}
chat_template.jinja
ADDED
@@ -0,0 +1,47 @@
+{{ bos_token }}
+{%- if messages[0]['role'] == 'system' -%}
+    {%- if messages[0]['content'] is string -%}
+        {%- set first_user_prefix = messages[0]['content'] + '
+
+' -%}
+    {%- else -%}
+        {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
+
+' -%}
+    {%- endif -%}
+    {%- set loop_messages = messages[1:] -%}
+{%- else -%}
+    {%- set first_user_prefix = "" -%}
+    {%- set loop_messages = messages -%}
+{%- endif -%}
+{%- for message in loop_messages -%}
+    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
+        {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
+    {%- endif -%}
+    {%- if (message['role'] == 'assistant') -%}
+        {%- set role = "model" -%}
+    {%- else -%}
+        {%- set role = message['role'] -%}
+    {%- endif -%}
+    {{ '<start_of_turn>' + role + '
+' + (first_user_prefix if loop.first else "") }}
+    {%- if message['content'] is string -%}
+        {{ message['content'] | trim }}
+    {%- elif message['content'] is iterable -%}
+        {%- for item in message['content'] -%}
+            {%- if item['type'] == 'image' -%}
+                {{ '<start_of_image>' }}
+            {%- elif item['type'] == 'text' -%}
+                {{ item['text'] | trim }}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- else -%}
+        {{ raise_exception("Invalid content type") }}
+    {%- endif -%}
+    {{ '<end_of_turn>
+' }}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {{'<start_of_turn>model
+'}}
+{%- endif -%}
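This is the standard Gemma 3 turn format (note that it renames the `assistant` role to `model`). A short sketch of rendering it through `tokenizer.apply_chat_template`, using the repo name from the README and an illustrative prompt:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tlgoa/tmr-ai-nano")

messages = [{"role": "user", "content": "Summarize what EBITDA measures."}]

# add_generation_prompt=True appends the trailing '<start_of_turn>model' block
# defined at the end of the template above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```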
config.json
ADDED
@@ -0,0 +1,54 @@
+{
+  "_sliding_window_pattern": 6,
+  "architectures": [
+    "Gemma3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "attn_logit_softcapping": null,
+  "bos_token_id": 2,
+  "dtype": "float32",
+  "eos_token_id": 1,
+  "final_logit_softcapping": null,
+  "head_dim": 256,
+  "hidden_activation": "gelu_pytorch_tanh",
+  "hidden_size": 640,
+  "initializer_range": 0.02,
+  "intermediate_size": 2048,
+  "layer_types": [
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "full_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "full_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "sliding_attention",
+    "full_attention"
+  ],
+  "max_position_embeddings": 32768,
+  "model_type": "gemma3_text",
+  "num_attention_heads": 4,
+  "num_hidden_layers": 18,
+  "num_key_value_heads": 1,
+  "pad_token_id": 0,
+  "query_pre_attn_scalar": 256,
+  "rms_norm_eps": 1e-06,
+  "rope_local_base_freq": 10000.0,
+  "rope_scaling": null,
+  "rope_theta": 1000000.0,
+  "sliding_window": 512,
+  "transformers_version": "4.56.0",
+  "use_bidirectional_attention": false,
+  "use_cache": true,
+  "vocab_size": 262144
+}
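If you only want to sanity-check the architecture above (18 layers, a 512-token sliding window with full attention every sixth layer) without downloading the weights, a small sketch using `AutoConfig`:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("tlgoa/tmr-ai-nano")

print(config.model_type)          # gemma3_text
print(config.num_hidden_layers)   # 18
print(config.hidden_size)         # 640
# Count the full-attention layers in the sliding/full pattern.
print(config.layer_types.count("full_attention"), config.sliding_window)
```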
generation_config.json
ADDED
@@ -0,0 +1,11 @@
+{
+  "cache_implementation": "hybrid",
+  "do_sample": true,
+  "eos_token_id": [
+    1,
+    106
+  ],
+  "top_k": 64,
+  "top_p": 0.95,
+  "transformers_version": "4.56.0"
+}
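These defaults (sampling on, `top_k=64`, `top_p=0.95`, two EOS ids) are applied automatically by `generate()`; a brief sketch of overriding them for a single call, with illustrative values:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tlgoa/tmr-ai-nano"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("### User:\nWhat is working capital?\n\n### Assistant:", return_tensors="pt")

# Values from generation_config.json are used unless overridden per call.
outputs = model.generate(**inputs, max_new_tokens=128, top_p=0.9, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```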
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:965e4ebcb453b4b02e04a1132d90484cae0de6828fa5e52acdcbd539a085c2b2
+size 1072419256
special_tokens_map.json
ADDED
@@ -0,0 +1,33 @@
+{
+  "boi_token": "<start_of_image>",
+  "bos_token": {
+    "content": "<bos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eoi_token": "<end_of_image>",
+  "eos_token": {
+    "content": "<eos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "image_token": "<image_soft_token>",
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
tokenizer.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+size 33384568
tokenizer.model
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+size 4689074
tokenizer_config.json
ADDED
The diff for this file is too large to render.