sweelol committed on
Commit 922922f · verified · 1 Parent(s): faa6798

Update README.md

Files changed (1)
  1. README.md +73 -24
README.md CHANGED
@@ -1,55 +1,104 @@
  ---
  base_model: google/gemma-3-270m
  library_name: peft
  pipeline_tag: text-generation
  tags:
  - base_model:adapter:google/gemma-3-270m
  - transformers
  ---

- # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->

- ## Model Details

- ### Model Description

- <!-- Provide a longer summary of what this model is. -->

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

- ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

- ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

  ## Evaluation Results
 
  ---
+ license: apache-2.0
+ datasets:
+ - databricks/databricks-dolly-15k
  base_model: google/gemma-3-270m
  library_name: peft
  pipeline_tag: text-generation
  tags:
  - base_model:adapter:google/gemma-3-270m
  - transformers
+ - google
+ - gemma3
+ - prompt-tune
+ - sweelol-ai
+ - peft
  ---

+ # sweelol/pt-gemma3-270m-dolly

+ This model is part of the **Sweelol AI Hub**, a research project focused on efficient fine-tuning of modern language models on Kaggle accelerators.

+ **Full Research Notebook & Benchmark Results:** [Coming soon]

+ The **Sweelol AI Hub** collection is the result of experiments in efficient fine-tuning, optimization strategies, and knowledge distillation on the Gemma-3-270m architecture, using the Databricks Dolly-15k dataset on Kaggle TPUs/GPUs.

+ This is a **LoRA-adapted** version of the `google/gemma-3-270m` model. It was fine-tuned on the Databricks Dolly-15k dataset using the **Low-Rank Adaptation (LoRA)** technique. LoRA is a parameter-efficient fine-tuning method that freezes the original model weights and injects trainable low-rank matrices into the attention layers. This allows the model to learn task-specific knowledge (instruction following) while keeping the overall number of trainable parameters low. Only the LoRA adapter weights need to be stored, making this model highly efficient to deploy.
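
+ As a minimal sketch of what this means in practice (assuming this repository hosts the adapter weights in the standard PEFT format), the adapter can be attached to the frozen base model with the `peft` library:

+ ```python
+ from transformers import AutoModelForCausalLM
+ from peft import PeftModel
+
+ # Load the frozen base model, then attach the trained LoRA adapter on top of it.
+ base = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")
+ model = PeftModel.from_pretrained(base, "sweelol/pt-gemma3-270m-dolly")
+
+ # Optionally fold the low-rank updates back into the base weights for standalone deployment.
+ model = model.merge_and_unload()
+ ```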
 
+ - **Developed by:** Sweelol AI
+ - **Shared by:** Sweelol AI
+ - **Model type:** Causal Language Model
+ - **Language(s) (NLP):** English
+ - **License:** Apache 2.0
+ - **Base Model:** `google/gemma-3-270m`

+ ### Description
+
+ Gemma is a family of lightweight, state-of-the-art open models from Google,
+ built from the same research and technology used to create the Gemini models.
+ Gemma 3 models are multimodal, handling text and image input and generating text
+ output, with open weights for both pre-trained variants and instruction-tuned
+ variants. Gemma 3 has a large, 128K context window, multilingual support in over
+ 140 languages, and is available in more sizes than previous versions. Gemma 3
+ models are well-suited for a variety of text generation and image understanding
+ tasks, including question answering, summarization, and reasoning. Their
+ relatively small size makes it possible to deploy them in environments with
+ limited resources such as laptops, desktops or your own cloud infrastructure,
+ democratizing access to state of the art AI models and helping foster innovation
+ for everyone.

+ ### Usage

+ Below are some code snippets to help you get started quickly with the model. First, install the Transformers library; Gemma 3 is supported starting from transformers 4.50.0.

+ ```sh
+ $ pip install -U transformers
+ ```
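
+ If you plan to load the LoRA adapter through the PEFT library, or to use `device_map="auto"` in the GPU snippet further down, you may also need the `peft` and `accelerate` packages (optional; depends on your setup):

+ ```sh
+ $ pip install -U peft accelerate
+ ```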
 
+ Then, copy the snippet from the section that is relevant for your use case.

+ #### Running with the `pipeline` API

+ ```python
+ from transformers import pipeline
+ import torch
+
+ pipe = pipeline("text-generation", model="sweelol/pt-gemma3-270m-dolly", device="cuda", torch_dtype=torch.bfloat16)
+ output = pipe("Eiffel tower is located in", max_new_tokens=50)
+ ```
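
+ The `text-generation` pipeline returns a list of dictionaries, one per generated sequence; the completion itself is under the `generated_text` key:

+ ```python
+ # Print the prompt plus the generated continuation for the first (and only) sequence.
+ print(output[0]["generated_text"])
+ ```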
 
+ #### Running the model on a single / multi GPU

+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load the tokenizer and make sure a padding token is defined.
+ tokenizer = AutoTokenizer.from_pretrained("sweelol/pt-gemma3-270m-dolly")
+ if tokenizer.pad_token is None:
+     tokenizer.pad_token = tokenizer.eos_token
+     print("✅ Set tokenizer pad_token to eos_token")
+
+ # Load the fine-tuned model in bfloat16 with the eager attention implementation.
+ model = AutoModelForCausalLM.from_pretrained(
+     "sweelol/pt-gemma3-270m-dolly",
+     torch_dtype=torch.bfloat16,
+     attn_implementation="eager",
+     device_map="auto",  # place the model on the available GPU(s); requires the accelerate package
+ )
+ print(f"✅ Model loaded (dtype: {model.dtype}).")
+
+ prompt = "Eiffel tower is located in"
+ model_inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ input_len = model_inputs["input_ids"].shape[-1]
+
+ with torch.inference_mode():
+     generation = model.generate(**model_inputs, max_new_tokens=50, do_sample=False)
+     generation = generation[0][input_len:]
+
+ decoded = tokenizer.decode(generation, skip_special_tokens=True)
+ print(decoded)
+ ```
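
+ Because the adapter was tuned on the Dolly-15k instruction dataset, instruction-style prompts generally work better than plain continuations. The exact prompt template used during training is not documented in this card, so the format below is only an illustrative assumption; adjust it to match the training script (continuing from the snippet above):

+ ```python
+ # Hypothetical Dolly-style prompt layout (instruction + response marker); the real
+ # training template may differ, so treat this as an assumption, not the card's spec.
+ instruction = "Give three tips for staying productive while working from home."
+ prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
+
+ model_inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ with torch.inference_mode():
+     generation = model.generate(**model_inputs, max_new_tokens=100, do_sample=False)
+
+ print(tokenizer.decode(generation[0][model_inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
+ ```
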
  ## Evaluation Results