
Qwen3-8B Gastronomía Hispana DPO LoRA

A specialized culinary assistant for Hispanic gastronomy, fine-tuned with Direct Preference Optimization (DPO)

Model Description

This LoRA adapter transforms Qwen3-8B into an expert culinary assistant specialized in Hispanic and Latino cuisine. The model has been fine-tuned using Direct Preference Optimization (DPO) to provide high-quality, culturally authentic responses about cooking techniques, ingredients, and traditional recipes from Spanish-speaking countries.

Key Features

  • 🥘 Specialized Knowledge: Expert-level understanding of Hispanic/Latino culinary traditions
  • 🔧 DPO Training: Enhanced response quality through preference optimization
  • 🌍 Cultural Authenticity: Respects traditional cooking methods and regional variations
  • 📚 Comprehensive Coverage: Ingredients, techniques, recipes, and cultural context
  • 🇪🇸 Spanish Language: Native Spanish culinary terminology and explanations

Base Model

  • Base Checkpoint: unsloth/Qwen3-8B-unsloth-bnb-4bit
  • Quantization: 4-bit (BNB)
  • Chat Template: ChatML format
  • Context Length: 2,500 tokens
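
For reference, the ChatML template wraps each conversation turn in <|im_start|> / <|im_end|> markers, which the tokenizer applies automatically (see Usage below):

<|im_start|>user
¿Cómo preparo el encebollado ecuatoriano tradicional?<|im_end|>
<|im_start|>assistant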

Training Details

DPO Configuration

  • Method: Direct Preference Optimization
  • Beta: 0.1 (KL regularization parameter)
  • Epochs: 3
  • Learning Rate: 5e-6
  • Scheduler: Cosine with 3% warmup
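
For context, β is the coefficient on the implicit reward in the standard DPO objective of Rafailov et al. (2023); a smaller β allows the policy to drift further from the reference model:

$$
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the chosen and rejected responses and $\pi_{\mathrm{ref}}$ is the frozen reference policy.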

LoRA Configuration

  • Rank (r): 64
  • Alpha: 64
  • Dropout: 0.0
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • RSLoRA: Enabled for rank stabilization
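
A minimal Unsloth sketch of this adapter setup (the exact training script is not published; argument names follow FastLanguageModel.get_peft_model):

from unsloth import FastLanguageModel

# Attach LoRA adapters matching the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,  # base model loaded as shown in "Loading the Model" below
    r=64,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,  # rank-stabilized LoRA
)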

Training Infrastructure

  • Batch Size: 32 (4 per device × 8 gradient accumulation steps)
  • Optimizer: AdamW 8-bit
  • Weight Decay: 0.01
  • Max Gradient Norm: 1.0
  • Training Time: ~5.4 hours
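
Taken together, the DPO and infrastructure settings map onto TRL roughly as follows. This is a sketch, not the team's actual script; processing_class is called tokenizer in older TRL releases:

from trl import DPOConfig, DPOTrainer

training_args = DPOConfig(
    beta=0.1,                       # KL regularization strength
    num_train_epochs=3,
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,              # 3% warmup
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch size 32
    optim="adamw_8bit",
    weight_decay=0.01,
    max_grad_norm=1.0,
    bf16=True,
    output_dir="outputs",
)

trainer = DPOTrainer(
    model=model,             # LoRA-wrapped model from the sketch above
    args=training_args,
    train_dataset=train_ds,  # preference pairs; see Dataset below
    eval_dataset=eval_ds,
    processing_class=tokenizer,
)
trainer.train()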

Dataset

  • Source: somosnlp-hackathon-2025/gastronomia-hispana-dpo
  • Size: 7,092 preference pairs
  • Split: 95% train, 5% evaluation
  • Format: DPO preference pairs (chosen vs rejected responses)
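
The split described above can be reproduced with 🤗 Datasets; the seed here is illustrative, since the original split seed is not documented:

from datasets import load_dataset

dataset = load_dataset("somosnlp-hackathon-2025/gastronomia-hispana-dpo", split="train")
splits = dataset.train_test_split(test_size=0.05, seed=42)  # 95% train / 5% eval
train_ds, eval_ds = splits["train"], splits["test"]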

Usage

Loading the Model

from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    max_seq_length=2500,
    dtype=None,
    load_in_4bit=True,
)

# Setup chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    mapping={"role": "role", "content": "content", "user": "user", "assistant": "assistant"},
    map_eos_token=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

Inference Example

from transformers import TextStreamer

# Prepare conversation
messages = [
    {"role": "user", "content": "¿Cómo preparo el encebollado ecuatoriano tradicional?"}
]

# Tokenize input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate response
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids=inputs, 
    streamer=text_streamer, 
    max_new_tokens=512, 
    use_cache=True,
    do_sample=True,
    temperature=0.7,
    top_p=0.95
)

Alternative Loading (HF Transformers)

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA"
)
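
If you need a standalone checkpoint, the adapter can also be merged into the base weights with PEFT. This sketch assumes enough memory to load the model in bf16, since merging into 4-bit quantized weights is not supported:

import torch
from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    torch_dtype=torch.bfloat16,
)
merged = model.merge_and_unload()                      # fold LoRA deltas into the base weights
merged.save_pretrained("qwen3-8b-gastronomia-merged")  # illustrative output path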

Example Interactions

Ingredient Expertise

User: "¿Qué tipo de harina se utiliza para hacer los crackers de los canapés con crema de salmón?"

Assistant: "Para preparar los crackers de los canapés con crema de salmón, la receta especifica el uso de harina de fuerza. Esta harina, también conocida como harina fuerte o harina de panadería, es ideal para este tipo de preparaciones debido a su alto contenido de gluten..."

Technique Guidance

User: "¿Cuáles son los errores más comunes al preparar macarrones con repollo?"

Assistant: "Al preparar macarrones con repollo, un plato de origen eslavo... los errores más comunes incluyen: 1. Cocer en exceso los macarrones... 2. Cortar el repollo demasiado grueso..."

Cultural Context

User: "¿De dónde viene el Cheesecake japonés y qué lo hace único?"

Assistant: "El Cheesecake japonés, también conocido como tarta de queso que tiembla... tiene su origen en la fusión de la repostería occidental con la precisión y delicadeza de la cocina japonesa..."

Model Capabilities

Specialized Domains

  • Traditional Recipes: Step-by-step authentic preparation methods
  • Ingredient Knowledge: Properties, uses, and cultural significance
  • Cooking Techniques: International methods with cultural context
  • Substitutions: Appropriate alternatives respecting authenticity
  • Cultural Context: Historical and regional cooking traditions

Response Quality

  • Detailed Explanations: Comprehensive, technically accurate guidance
  • Cultural Sensitivity: Respects traditional methods and origins
  • Practical Tips: Real-world cooking advice and troubleshooting
  • Educational Value: Teaches both technique and cultural background

Performance Metrics

  • Training Loss: Converged effectively over 3 epochs
  • Memory Usage: ~10.3GB peak GPU memory during training
  • Inference Speed: 2x faster with Unsloth optimizations
  • Model Size: ~168M trainable parameters (2.4% of base model)

Limitations

  • Language: Primarily optimized for Spanish culinary content
  • Domain: Specialized for cooking/gastronomy (may not perform well on other topics)
  • Context: Limited to 2,500 tokens per conversation
  • Base Model: Inherits any limitations of the underlying Qwen3-8B model

Technical Requirements

  • GPU Memory: Minimum 8GB for inference, 12GB+ recommended for fine-tuning
  • CUDA: Compatible with CUDA 12.4+
  • Libraries: Unsloth, Transformers 4.52+, PEFT, TRL
  • Python: 3.8+

Training Environment

  • Hardware: NVIDIA L40S (44GB VRAM)
  • Framework: Unsloth 2025.5.10
  • Precision: BF16 training, 4-bit quantization
  • Optimization: Gradient checkpointing, 8-bit AdamW

Ethical Considerations

  • Cultural Respect: Trained to honor traditional cooking methods and cultural origins
  • Accuracy: Provides technically sound culinary advice
  • Safety: Includes appropriate food safety considerations
  • Authenticity: Prioritizes traditional techniques over convenience modifications

Citation

@misc{gastronomia-hispana-dpo-2025,
  title={Qwen3-8B Gastronomía Hispana DPO LoRA},
  author={SomosNLP Hackathon 2025 Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA}
}

Acknowledgments

  • SomosNLP: For organizing the hackathon and providing the platform
  • Unsloth: For efficient training optimizations
  • Dataset Contributors: For creating the Hispanic gastronomy preference dataset
  • Qwen Team: For the Qwen3-8B foundation model

License

This adapter is released under the same license as the base Qwen3-8B model. Please refer to the original model's licensing terms for commercial use.


Note: This model is designed for educational and culinary assistance purposes. Always follow proper food safety guidelines when cooking.

Uploaded model

  • Developed by: somosnlp-hackathon-2025
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3-8B-unsloth-bnb-4bit

This qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
