Qwen3-8B Gastronomía Hispana DPO LoRA
A specialized culinary assistant for Hispanic gastronomy, fine-tuned with Direct Preference Optimization (DPO)
Model Description
This LoRA adapter transforms Qwen3-8B into an expert culinary assistant specialized in Hispanic and Latino cuisine. The model has been fine-tuned using Direct Preference Optimization (DPO) to provide high-quality, culturally authentic responses about cooking techniques, ingredients, and traditional recipes from Spanish-speaking countries.
Key Features
- 🥘 Specialized Knowledge: Expert-level understanding of Hispanic/Latino culinary traditions
- 🔧 DPO Training: Enhanced response quality through preference optimization
- 🌍 Cultural Authenticity: Respects traditional cooking methods and regional variations
- 📚 Comprehensive Coverage: Ingredients, techniques, recipes, and cultural context
- 🇪🇸 Spanish Language: Native Spanish culinary terminology and explanations
Base Model
- Base Checkpoint: unsloth/Qwen3-8B-unsloth-bnb-4bit (Qwen3-8B architecture)
- Quantization: 4-bit (BNB)
- Chat Template: ChatML format
- Context Length: 2,500 tokens
Training Details
DPO Configuration
- Method: Direct Preference Optimization
- Beta: 0.1 (KL regularization parameter)
- Epochs: 3
- Learning Rate: 5e-6
- Scheduler: Cosine with 3% warmup
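
For reference, DPO minimizes $-\log \sigma\!\left(\beta\left[\log\frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \log\frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right]\right)$ over (prompt, chosen, rejected) triples, so beta controls how strongly the policy is regularized toward the reference model. A minimal sketch of how the hyperparameters above map onto TRL's `DPOConfig` (the exact training script is not published, so treat field values as illustrative):

```python
from trl import DPOConfig

# Hypothetical reconstruction of the DPO hyperparameters listed above
config = DPOConfig(
    beta=0.1,                   # KL regularization strength
    num_train_epochs=3,
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,          # 3% warmup
    output_dir="outputs",
)
```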
LoRA Configuration
- Rank (r): 64
- Alpha: 64
- Dropout: 0.0
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- RSLoRA: Enabled for rank stabilization
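
In PEFT terms, the adapter settings above correspond roughly to the following `LoraConfig` (a sketch, not the published script; `use_rslora` enables the rank stabilization mentioned above):

```python
from peft import LoraConfig

# Illustrative LoraConfig matching the listed hyperparameters
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,   # rank-stabilized LoRA: scales by alpha/sqrt(r) instead of alpha/r
    task_type="CAUSAL_LM",
)
```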
Training Infrastructure
- Batch Size: 32 (4 per device × 8 gradient accumulation steps)
- Optimizer: AdamW 8-bit
- Weight Decay: 0.01
- Max Gradient Norm: 1.0
- Training Time: ~5.4 hours
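
Put together, the effective batch size of 32 and the optimizer settings above would be expressed in the same `DPOConfig` like this (again a hedged sketch under the assumptions stated earlier):

```python
from trl import DPOConfig

# Additional fields implied by the infrastructure above (illustrative)
config = DPOConfig(
    beta=0.1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # 4 x 8 = effective batch size 32
    optim="adamw_8bit",              # 8-bit AdamW via bitsandbytes
    weight_decay=0.01,
    max_grad_norm=1.0,
    bf16=True,
    output_dir="outputs",
)
```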
Dataset
- Source: somosnlp-hackathon-2025/gastronomia-hispana-dpo
- Size: 7,092 preference pairs
- Split: 95% train, 5% evaluation
- Format: DPO preference pairs (chosen vs rejected responses)
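
Loading and splitting the dataset as described might look like the following (this assumes the standard DPO column layout of `prompt`/`chosen`/`rejected` and an arbitrary seed; verify against the dataset card):

```python
from datasets import load_dataset

# 7,092 preference pairs, split 95/5 into train/eval
dataset = load_dataset("somosnlp-hackathon-2025/gastronomia-hispana-dpo", split="train")
splits = dataset.train_test_split(test_size=0.05, seed=42)  # seed is an assumption
train_ds, eval_ds = splits["train"], splits["test"]
```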
Usage
Loading the Model
```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    max_seq_length=2500,
    dtype=None,
    load_in_4bit=True,
)

# Set up the ChatML chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    mapping={"role": "role", "content": "content", "user": "user", "assistant": "assistant"},
    map_eos_token=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)
```
Inference Example
```python
from transformers import TextStreamer

# Prepare conversation
messages = [
    {"role": "user", "content": "¿Cómo preparo el encebollado ecuatoriano tradicional?"}
]

# Tokenize input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate response, streaming tokens as they are produced
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids=inputs,
    streamer=text_streamer,
    max_new_tokens=512,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
```
Alternative Loading (HF Transformers)
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model and applies the LoRA adapter in one call
model = AutoPeftModelForCausalLM.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA"
)
```
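
Generation then works exactly as in the Unsloth example above: build a prompt with `tokenizer.apply_chat_template(...)` and call `model.generate(...)`.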
Example Interactions
Ingredient Expertise
User: "¿Qué tipo de harina se utiliza para hacer los crackers de los canapés con crema de salmón?"
Assistant: "Para preparar los crackers de los canapés con crema de salmón, la receta especifica el uso de harina de fuerza. Esta harina, también conocida como harina fuerte o harina de panadería, es ideal para este tipo de preparaciones debido a su alto contenido de gluten..."
Technique Guidance
User: "¿Cuáles son los errores más comunes al preparar macarrones con repollo?"
Assistant: "Al preparar macarrones con repollo, un plato de origen eslavo... los errores más comunes incluyen: 1. Cocer en exceso los macarrones... 2. Cortar el repollo demasiado grueso..."
Cultural Context
User: "¿De dónde viene el Cheesecake japonés y qué lo hace único?"
Assistant: "El Cheesecake japonés, también conocido como tarta de queso que tiembla... tiene su origen en la fusión de la repostería occidental con la precisión y delicadeza de la cocina japonesa..."
Model Capabilities
Specialized Domains
- Traditional Recipes: Step-by-step authentic preparation methods
- Ingredient Knowledge: Properties, uses, and cultural significance
- Cooking Techniques: International methods with cultural context
- Substitutions: Appropriate alternatives respecting authenticity
- Cultural Context: Historical and regional cooking traditions
Response Quality
- Detailed Explanations: Comprehensive, technically accurate guidance
- Cultural Sensitivity: Respects traditional methods and origins
- Practical Tips: Real-world cooking advice and troubleshooting
- Educational Value: Teaches both technique and cultural background
Performance Metrics
- Training Loss: Converged effectively over 3 epochs
- Memory Usage: ~10.3GB peak GPU memory during training
- Inference Speed: 2x faster with Unsloth optimizations
- Model Size: ~168M trainable parameters (2.4% of base model)
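
The trainable-parameter figure can be sanity-checked once the adapter is loaded via the PEFT path above (counts may differ slightly under 4-bit weight packing):

```python
# Count trainable (LoRA) parameters vs. total parameters
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} ({100 * trainable / total:.1f}% of total)")
```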
Limitations
- Language: Primarily optimized for Spanish culinary content
- Domain: Specialized for cooking/gastronomy (may not perform well on other topics)
- Context: Limited to 2,500 tokens per conversation
- Base Model: Inherits any limitations from the underlying Qwen3-8B model
Technical Requirements
- GPU Memory: Minimum 8GB for inference, 12GB+ recommended for fine-tuning
- CUDA: Compatible with CUDA 12.4+
- Libraries: Unsloth, Transformers 4.52+, PEFT, TRL
- Python: 3.8+
Training Environment
- Hardware: NVIDIA L40S (44GB VRAM)
- Framework: Unsloth 2025.5.10
- Precision: BF16 training, 4-bit quantization
- Optimization: Gradient checkpointing, 8-bit AdamW
Ethical Considerations
- Cultural Respect: Trained to honor traditional cooking methods and cultural origins
- Accuracy: Provides technically sound culinary advice
- Safety: Includes appropriate food safety considerations
- Authenticity: Prioritizes traditional techniques over convenience modifications
Citation
```bibtex
@misc{gastronomia-hispana-dpo-2025,
  title={Qwen3-8B Gastronomía Hispana DPO LoRA},
  author={SomosNLP Hackathon 2025 Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/somosnlp-hackathon-2025/Qwen3-8B-gastronomia-hispana-dpo-LoRA}
}
```
Acknowledgments
- SomosNLP: For organizing the hackathon and providing the platform
- Unsloth: For efficient training optimizations
- Dataset Contributors: For creating the Hispanic gastronomy preference dataset
- Base Model: The Qwen team for the Qwen3-8B foundation model
License
This adapter is released under the same license as the base Qwen3-8B model. Please refer to the original model's licensing terms for commercial use.
Note: This model is designed for educational and culinary assistance purposes. Always follow proper food safety guidelines when cooking.
Uploaded model
- Developed by: somosnlp-hackathon-2025
- License: apache-2.0
- Finetuned from model: unsloth/Qwen3-8B-unsloth-bnb-4bit
This qwen3 model was trained 2x faster with Unsloth and Huggingface's TRL library.