
Qwen3-8B Gastronomía Hispana DPO LoRA

A specialized culinary assistant for Hispanic gastronomy, fine-tuned with Direct Preference Optimization (DPO)

Model Description

This LoRA adapter transforms Qwen3-8B into an expert culinary assistant specialized in Hispanic and Latino cuisine. The model has been fine-tuned using Direct Preference Optimization (DPO) to provide high-quality, culturally authentic responses about cooking techniques, ingredients, and traditional recipes from Spanish-speaking countries.

Key Features

  • 🥘 Specialized Knowledge: Expert-level understanding of Hispanic/Latino culinary traditions
  • 🔧 DPO Training: Enhanced response quality through preference optimization
  • 🌍 Cultural Authenticity: Respects traditional cooking methods and regional variations
  • 📚 Comprehensive Coverage: Ingredients, techniques, recipes, and cultural context
  • 🇪🇸 Spanish Language: Native Spanish culinary terminology and explanations

Base Model

  • Base Checkpoint: unsloth/Qwen3-8B-unsloth-bnb-4bit
  • Quantization: 4-bit (BNB)
  • Chat Template: ChatML format
  • Context Length: 2,500 tokens
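
For reference, the ChatML template wraps each conversation turn in <|im_start|> / <|im_end|> markers, which the tokenizer applies automatically (see Usage below):

<|im_start|>user
¿Cómo preparo el encebollado ecuatoriano tradicional?<|im_end|>
<|im_start|>assistant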

Training Details

DPO Configuration

  • Method: Direct Preference Optimization
  • Beta: 0.1 (KL regularization parameter)
  • Epochs: 3
  • Learning Rate: 5e-6
  • Scheduler: Cosine with 3% warmup
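
For context, β is the coefficient on the implicit reward in the standard DPO objective of Rafailov et al. (2023); a smaller β allows the policy to drift further from the reference model:

$$
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the chosen and rejected responses and $\pi_{\mathrm{ref}}$ is the frozen reference policy.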

LoRA Configuration

  • Rank (r): 64
  • Alpha: 64
  • Dropout: 0.0
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • RSLoRA: Enabled for rank stabilization
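
A minimal Unsloth sketch of this adapter setup (the exact training script is not published; argument names follow FastLanguageModel.get_peft_model):

from unsloth import FastLanguageModel

# Attach LoRA adapters matching the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,  # base model loaded as shown in "Loading the Model" below
    r=64,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,  # rank-stabilized LoRA
)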

Training Infrastructure

  • Batch Size: 32 (4 per device × 8 gradient accumulation steps)
  • Optimizer: AdamW 8-bit
  • Weight Decay: 0.01
  • Max Gradient Norm: 1.0
  • Training Time: ~5.4 hours
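
Taken together, the DPO and infrastructure settings map onto TRL roughly as follows. This is a sketch, not the team's actual script; processing_class is called tokenizer in older TRL releases:

from trl import DPOConfig, DPOTrainer

training_args = DPOConfig(
    beta=0.1,                       # KL regularization strength
    num_train_epochs=3,
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,              # 3% warmup
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch size 32
    optim="adamw_8bit",
    weight_decay=0.01,
    max_grad_norm=1.0,
    bf16=True,
    output_dir="outputs",
)

trainer = DPOTrainer(
    model=model,             # LoRA-wrapped model from the sketch above
    args=training_args,
    train_dataset=train_ds,  # preference pairs; see Dataset below
    eval_dataset=eval_ds,
    processing_class=tokenizer,
)
trainer.train()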

Dataset

  • Source: somosnlp-hackathon-2025/gastronomia-hispana-dpo
  • Size: 7,092 preference pairs
  • Split: 95% train, 5% evaluation
  • Format: DPO preference pairs (chosen vs rejected responses)
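
The split described above can be reproduced with 🤗 Datasets; the seed here is illustrative, since the original split seed is not documented:

from datasets import load_dataset

dataset = load_dataset("somosnlp-hackathon-2025/gastronomia-hispana-dpo", split="train")
splits = dataset.train_test_split(test_size=0.05, seed=42)  # 95% train / 5% eval
train_ds, eval_ds = splits["train"], splits["test"]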

Usage

Loading the Model

from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    max_seq_length=2500,
    dtype=None,
    load_in_4bit=True,
)

# Setup chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    mapping={"role": "role", "content": "content", "user": "user", "assistant": "assistant"},
    map_eos_token=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

Inference Example

from transformers import TextStreamer

# Prepare conversation
messages = [
    {"role": "user", "content": "¿Cómo preparo el encebollado ecuatoriano tradicional?"}
]

# Tokenize input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate response
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids=inputs, 
    streamer=text_streamer, 
    max_new_tokens=512, 
    use_cache=True,
    do_sample=True,
    temperature=0.7,
    top_p=0.95
)

Alternative Loading (HF Transformers)

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA"
)
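
If you need a standalone checkpoint, the adapter can also be merged into the base weights with PEFT. This sketch assumes enough memory to load the model in bf16, since merging into 4-bit quantized weights is not supported:

import torch
from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    torch_dtype=torch.bfloat16,
)
merged = model.merge_and_unload()                      # fold LoRA deltas into the base weights
merged.save_pretrained("qwen3-8b-gastronomia-merged")  # illustrative output path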

Example Interactions

Ingredient Expertise

User: "¿Qué tipo de harina se utiliza para hacer los crackers de los canapés con crema de salmón?"

Assistant: "Para preparar los crackers de los canapés con crema de salmón, la receta especifica el uso de harina de fuerza. Esta harina, también conocida como harina fuerte o harina de panadería, es ideal para este tipo de preparaciones debido a su alto contenido de gluten..."

Technique Guidance

User: "¿Cuáles son los errores más comunes al preparar macarrones con repollo?"

Assistant: "Al preparar macarrones con repollo, un plato de origen eslavo... los errores más comunes incluyen: 1. Cocer en exceso los macarrones... 2. Cortar el repollo demasiado grueso..."

Cultural Context

User: "¿De dónde viene el Cheesecake japonés y qué lo hace único?"

Assistant: "El Cheesecake japonés, también conocido como tarta de queso que tiembla... tiene su origen en la fusión de la repostería occidental con la precisión y delicadeza de la cocina japonesa..."

Model Capabilities

Specialized Domains

  • Traditional Recipes: Step-by-step authentic preparation methods
  • Ingredient Knowledge: Properties, uses, and cultural significance
  • Cooking Techniques: International methods with cultural context
  • Substitutions: Appropriate alternatives respecting authenticity
  • Cultural Context: Historical and regional cooking traditions

Response Quality

  • Detailed Explanations: Comprehensive, technically accurate guidance
  • Cultural Sensitivity: Respects traditional methods and origins
  • Practical Tips: Real-world cooking advice and troubleshooting
  • Educational Value: Teaches both technique and cultural background

Performance Metrics

  • Training Loss: Converged effectively over 3 epochs
  • Memory Usage: ~10.3GB peak GPU memory during training
  • Inference Speed: 2x faster with Unsloth optimizations
  • Model Size: ~168M trainable parameters (2.4% of base model)

Limitations

  • Language: Primarily optimized for Spanish culinary content
  • Domain: Specialized for cooking/gastronomy (may not perform well on other topics)
  • Context: Limited to 2,500 tokens per conversation
  • Base Model: Inherits any limitations of the underlying Qwen3-8B model

Technical Requirements

  • GPU Memory: Minimum 8GB for inference, 12GB+ recommended for fine-tuning
  • CUDA: Compatible with CUDA 12.4+
  • Libraries: Unsloth, Transformers 4.52+, PEFT, TRL
  • Python: 3.8+

Training Environment

  • Hardware: NVIDIA L40S (44GB VRAM)
  • Framework: Unsloth 2025.5.10
  • Precision: BF16 training, 4-bit quantization
  • Optimization: Gradient checkpointing, 8-bit AdamW

Ethical Considerations

  • Cultural Respect: Trained to honor traditional cooking methods and cultural origins
  • Accuracy: Provides technically sound culinary advice
  • Safety: Includes appropriate food safety considerations
  • Authenticity: Prioritizes traditional techniques over convenience modifications

Citation

@misc{gastronomia-hispana-dpo-2025,
  title={Qwen3-8B Gastronomía Hispana DPO LoRA},
  author={SomosNLP Hackathon 2025 Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA}
}

Acknowledgments

  • SomosNLP: For organizing the hackathon and providing the platform
  • Unsloth: For efficient training optimizations
  • Dataset Contributors: For creating the Hispanic gastronomy preference dataset
  • Qwen Team: For the Qwen3-8B foundation model

License

This adapter is released under the same license as the base Qwen3-8B model. Please refer to the original model's licensing terms for commercial use.


Note: This model is designed for educational and culinary assistance purposes. Always follow proper food safety guidelines when cooking.

Uploaded model

  • Developed by: somosnlp-hackathon-2025
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3-8B-unsloth-bnb-4bit

This qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
