Model Card for emre/gemma-7b-it-Turkish-Reasoning-FT-smol
Model Details
- Model Name: emre/gemma-7b-it-Turkish-Reasoning-FT-smol
- Developed by: Davut Emre Tasar
- Base Model: unsloth/gemma-7b-it-bnb-4bit
- License: Apache-2.0 (note: the config mentions AFL-3.0, but the uploaded model card specifies Apache-2.0, which is assumed here)
- Languages:
  - Turkish (tr): Primary language for input and output.
  - English (en): Used in dataset reasoning components.
- Tags: text-generation-inference, transformers, unsloth, gemma, trl
- Datasets: emre/finance-reasoning-turkish (Hugging Face dataset)
- Fine-Tuning Method: Supervised Fine-Tuning (SFT) with LoRA (Low-Rank Adaptation)
- Training Framework: Hugging Face TRL library with Unsloth
This model is a fine-tuned version of the instruction-tuned unsloth/gemma-7b-it-bnb-4bit model, adapted for advanced reasoning tasks in Turkish with a focus on finance-related question answering. It leverages the emre/finance-reasoning-turkish dataset and was trained with memory-efficient techniques (4-bit quantization and LoRA) on a single A100 GPU.
Model Usage
This model is designed for text generation, particularly for answering Turkish-language questions with reasoning components. It can be loaded and used with the Hugging Face transformers library. Instructions for loading the model and generating responses follow; complete code examples are given in the Usage section below.
Prerequisites
- Python 3.11+
- transformers library (version 4.49.0 recommended)
- torch library with CUDA support for GPU usage
- bitsandbytes library (required for 4-bit quantization)
- unsloth library (optional, for faster inference if compatible)
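A quick sanity check for this environment (a minimal sketch; it only verifies the items listed above):

import torch
import transformers

print(transformers.__version__)   # 4.49.0 recommended above
print(torch.cuda.is_available())  # should be True for GPU inference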
Loading the Model and Tokenizer
You can load the model and tokenizer from the Hugging Face Hub. The model uses 4-bit quantization, so ensure your environment supports bitsandbytes.
Generating Responses
The model expects input in a conversational format, typically with a user question and optional context (e.g., RAG content). It generates responses with reasoning steps, suitable for finance-related queries in Turkish.
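For example, a prompt can be assembled from the question and the optional retrieved context. This helper is hypothetical; the <user_question> / <RAG Content> tags follow the example in the Usage section below:

def build_prompt(question: str, rag_content: str = "") -> str:
    """Assemble the conversational input format described above."""
    prompt = f"<user_question>{question}</user_question>\n"
    if rag_content:  # context is optional
        prompt += f"<RAG Content>{rag_content}</RAG Content>\n"
    return prompt

# Example: a Turkish finance question without retrieved context
prompt = build_prompt("Enflasyon mevduat faizini nasıl etkiler?")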
Training Procedure
Dataset
- Name: emre/finance-reasoning-turkish
- Description: A custom dataset containing Turkish finance-related questions, answers, and reasoning components, likely generated or curated for advanced reasoning tasks.
Training Configuration
- Optimizer: adamw_8bit
- Learning Rate: 2e-4 (initial), with linear scheduling
- Batch Size: Effective batch size of 8 (per_device_train_batch_size=2, gradient_accumulation_steps=4)
- Epochs: 2
- Warmup Steps: 5
- Weight Decay: 0.01
- Mixed Precision: FP16 (fp16=True)
- LoRA Configuration:
  - Rank (r): 16
  - Alpha (lora_alpha): 16
  - Dropout (lora_dropout): 0
  - Target Modules: ["q_proj", "o_proj", "gate_proj", "v_proj", "up_proj", "down_proj", "k_proj"]
  - Task Type: CAUSAL_LM
- Hardware: Single A100 GPU (40GB)
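For reference, the configuration above maps onto a TRL SFTTrainer with Unsloth roughly as in the sketch below. This is a minimal reconstruction, not the actual training script: max_seq_length, the dataset's text column, and output_dir are assumptions, and SFTTrainer argument names vary slightly across TRL versions.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model with Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-7b-it-bnb-4bit",
    max_seq_length=2048,  # assumption: not stated in this card
    load_in_4bit=True,
)

# Attach the LoRA adapters described in the configuration above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "o_proj", "gate_proj", "v_proj",
                    "up_proj", "down_proj", "k_proj"],
)

# Assumption: the dataset is flattened to a single "text" column beforehand
dataset = load_dataset("emre/finance-reasoning-turkish", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption: column name not stated in this card
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=2,
        warmup_steps=5,
        learning_rate=2e-4,
        weight_decay=0.01,
        lr_scheduler_type="linear",
        optim="adamw_8bit",
        fp16=True,
        output_dir="outputs",  # assumption
    ),
)
trainer.train()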
Training Metrics
- Total Steps: 464
- Training Loss: 0.7098 (final), 0.8503 (average)
- Gradient Norm: 0.7641
- Runtime: 19,928 seconds (~5.5 hours)
- Samples per Second: 0.186
- Steps per Second: 0.023
- Total FLOPs: 267,448,136,380,698,620
The model was trained using the Hugging Face TRL library with Unsloth optimizations, which Unsloth reports as roughly 2x faster than standard fine-tuning methods.
Evaluation
Quantitative evaluation metrics (e.g., validation loss) are not explicitly provided in the config, but the training loss decreased to 0.7098 by the end of training, indicating convergence. For reasoning tasks, qualitative evaluation is recommended:
- Assess the coherence and relevance of generated responses.
- Check for correct reasoning steps in Turkish finance-related outputs.
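As a concrete starting point, a minimal qualitative check could generate answers for a few held-out questions and let a reviewer inspect the reasoning steps by hand. This sketch assumes the model and tokenizer are loaded as in the Usage section below; the sample questions are hypothetical:

# Hypothetical held-out questions for manual inspection
sample_questions = [
    "Merkez bankası faiz artırırsa krediler nasıl etkilenir?",
    "Enflasyon dönemlerinde mevduat faizi neden yükselir?",
]
for q in sample_questions:
    inputs = tokenizer(f"<user_question>{q}</user_question>\n",
                       return_tensors="pt").to(model.device)
    # Greedy decoding for reproducible, comparable outputs
    out = model.generate(**inputs, max_new_tokens=300, do_sample=False)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    print("---")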
Ethical Considerations
- Bias: The model may inherit biases from the base model or dataset, particularly in financial contexts. Users should review outputs for fairness.
- Intended Use: Designed for academic and research purposes. Not suitable for commercial financial advice due to licensing and potential limitations.
- Limitations: High computational requirements (GPU recommended) and potential overfitting to the finance domain.
Additional Information
- Repository: https://huggingface.co/emre/gemma-7b-it-Turkish-Reasoning-FT-smol
- Dataset: https://huggingface.co/datasets/emre/finance-reasoning-turkish
- Contact: https://huggingface.co/emre
Usage
1. Loading the Model and Tokenizer
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("emre/gemma-7b-it-Turkish-Reasoning-FT-smol")

# Load the model with 4-bit quantization; device_map="auto" places the
# quantized weights on the available GPU automatically (calling .to() on a
# 4-bit bitsandbytes model is not supported)
model = AutoModelForCausalLM.from_pretrained(
    "emre/gemma-7b-it-Turkish-Reasoning-FT-smol",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
2. Generating Responses
# Define the input text (example in Turkish)
input_text = (
    "<user_question>Bankalar neden yüksek faiz oranları sunuyor?</user_question>\n"
    "<RAG Content>Bankalar, mevduat toplamak ve krediler için fon sağlamak amacıyla faiz oranlarını "
    "artırabilir. Yüksek faiz oranları, genellikle ekonomik belirsizlik veya enflasyon dönemlerinde "
    "görülür. Bu, yatırımcıları çekmek ve likiditeyi artırmak için bir stratejidir.</RAG Content>\n"
)
# Tokenize the input and move it to the model's device
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,  # cap on newly generated tokens; adjust as needed
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        num_return_sequences=1,
    )
# Decode and print the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Citation
If you use this model, please cite it as:
@misc{emre_gemma_7b_it_turkish_reasoning_ft_smol,
  author       = {Davut Emre Tasar},
  title        = {Gemma-7B-IT Turkish Reasoning FT Smol},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/emre/gemma-7b-it-Turkish-Reasoning-FT-smol}}
}