---
license: mit
datasets:
- SoAp9035/turkish_instructions
language:
- tr
base_model:
- google/gemma-3-270m-it-qat-q4_0-unquantized
pipeline_tag: text-generation
library_name: transformers
---
# Gemma 3 270M Turkish Instructions Fine-tuned
This model is a **fine-tuned version of Google Gemma 3 270M IT**, trained on the **SoAp9035/turkish_instructions** dataset with direct fine-tuning.
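During preprocessing, each instruction/response pair is wrapped in Gemma 3's turn markers. Below is a minimal sketch of that step; the column names `instruction` and `response` are assumptions made for illustration, so check the dataset card for the actual field names.

```python
from datasets import load_dataset

# Column names `instruction`/`response` are assumptions; see the dataset card.
dataset = load_dataset("SoAp9035/turkish_instructions", split="train")

def to_gemma_turns(example):
    # Gemma 3 turn format; the tokenizer prepends <bos> itself at tokenization time.
    example["text"] = (
        f"<start_of_turn>user\n{example['instruction']}<end_of_turn>\n"
        f"<start_of_turn>model\n{example['response']}<end_of_turn>\n"
    )
    return example

dataset = dataset.map(to_gemma_turns)
```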
## Model Details
- **Base model:** `google/gemma-3-270m-it-qat-q4_0-unquantized`
- **Fine-tuning dataset:** `SoAp9035/turkish_instructions` (Turkish instruction/response pairs, formatted with the Gemma 3 chat template; see the formatting sketch above)
- **Fine-tune type:** Direct fine-tuning (Causal LM)
- **Precision:** BF16 when the GPU supports it, otherwise full precision (FP32)
- **Max token length:** 256
- **Batch size:** 2 (effective batch size = 8 with gradient accumulation)
- **Number of epochs:** 2
- **Optimizer:** AdamW
- **Scheduler:** Cosine learning rate
- **Evaluation:** every 100 steps; the best checkpoint is selected by `eval_loss` (the setup is sketched below)
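For reference, the hyperparameters above map roughly onto a standard 🤗 `TrainingArguments` configuration as follows. This is a reconstruction from the list, not the original training script; values the card does not state (notably the learning rate and output directory) are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma3-270m-turkish-instructions",  # assumed path
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,    # 2 x 4 = effective batch size 8
    learning_rate=2e-5,               # assumption: not stated on the card
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,                        # set False on GPUs without BF16 support
    eval_strategy="steps",            # `evaluation_strategy` in older releases
    eval_steps=100,
    save_strategy="steps",
    save_steps=100,                   # must align with eval_steps for best-model loading
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Note: the 256-token maximum is applied when tokenizing the dataset, not here.
```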
## Usage Example
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "Dbmaxwell/gemma3-270m-turkish-instructions"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
if tokenizer.pad_token is None:
    # Reuse EOS as the padding token, as is common for decoder-only models.
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = "right"

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def generate_response(prompt, max_new_tokens=200):
    # Gemma 3 turn format. The tokenizer prepends <bos> on its own, so
    # add_special_tokens=False avoids a duplicated BOS token.
    formatted_prompt = f"<bos><start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n"
    inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(device)
    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_new_tokens=max_new_tokens,
            temperature=0.3,
            top_p=0.8,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.2,
            no_repeat_ngram_size=3,
        )
    # Decode only the newly generated tokens, then trim at the turn marker.
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    return response.split("<end_of_turn>")[0].strip()

test_prompts = [
    "Merhaba! Ben bir AI asistanım. Sana nasıl yardımcı olabilirim?",  # "Hello! I am an AI assistant. How can I help you?"
    "Python'da for döngüsü nasıl yazılır?",                            # "How do you write a for loop in Python?"
    "İstanbul Türkiye'nin en büyük şehridir. Kısa bilgi ver.",         # "Istanbul is Turkey's largest city. Give brief information."
    "Makine öğrenmesi nedir? Basit açıklama yap.",                     # "What is machine learning? Explain it simply."
    "5 artı 3 çarpı 2 kaçtır?",                                        # "What is 5 plus 3 times 2?"
    "Türkiye'nin başkenti neresidir?",                                 # "What is the capital of Turkey?"
]

for i, prompt in enumerate(test_prompts, 1):
    print(f"\nQuestion {i}: {prompt}")
    print(f"Answer: {generate_response(prompt, max_new_tokens=100)}")
    print("-" * 60)
```
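Equivalently, the prompt can be built with the tokenizer's chat template instead of hand-writing the turn markers. A short sketch, assuming the fine-tuned checkpoint retains Gemma 3's default template:

```python
# Alternative prompt construction via the chat template (assumption:
# the checkpoint keeps Gemma 3's default template).
messages = [{"role": "user", "content": "Türkiye'nin başkenti neresidir?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # appends "<start_of_turn>model\n"
    return_tensors="pt",
).to(device)
output = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.3)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```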