---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- llama-3.1
- sql
- fine-tuned
- agent
- unsloth
- text-generation
language:
- en
pipeline_tag: text-generation
datasets:
- custom
metrics:
- loss
---
# Better SQL Agent - Llama 3.1 8B

## World-Class Training Results

- Final Loss: 0.0508 (96.7% improvement from 1.53 starting loss)
- Training Duration: 8.9 hours
- Training Samples: 19,480 (SQL analytics + technical conversations)
- Hardware: NVIDIA A10G GPU (24GB VRAM)
- Framework: Unsloth optimization (2x speedup)
## Model Description
This is a high-performance fine-tuned version of Meta-Llama-3.1-8B-Instruct, specifically optimized for:
- SQL query generation and optimization
- Data analysis and insights
- Technical assistance and debugging
- Tool-based workflows

Training loss fell by 96.7% over the run, from 1.53 to 0.0508.
## Training Configuration

- Base Model: `meta-llama/Llama-3.1-8B-Instruct`
- Training Method: LoRA (Low-Rank Adaptation); a configuration sketch follows this list
  - Rank: 16, Alpha: 32, Dropout: 0.05
- Quantization: 4-bit base weights with BF16 training precision
- Context Length: 128K tokens (inherited from the base model)
- Optimizer: AdamW with cosine learning-rate scheduling
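For reference, here is a minimal sketch of how this adapter and quantization setup could be expressed with `peft` and `bitsandbytes`. The quantization type and target modules below are assumptions for illustration, not the exact training script:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit base-model quantization with BF16 compute, matching the settings above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # assumed quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter: rank 16, alpha 32, dropout 0.05
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    task_type="CAUSAL_LM",
)
```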
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the fine-tuned model
model_name = "abhishekgahlot/better-sql-agent-llama"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Generate a SQL query
prompt = """<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Create a SQL query to find the top 5 customers by total revenue in 2024:
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
print(response)
```
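The prompt above writes out Llama 3.1's chat-format tokens by hand. Passing `pad_token_id=tokenizer.eos_token_id` avoids the padding warning during generation, since the Llama tokenizer does not define a dedicated pad token.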
## Performance Metrics
| Metric | Value |
|---|---|
| Starting Loss | 1.53 |
| Final Loss | 0.0508 |
| Loss Reduction | 96.7% |
| Training Time | 8.9 hours |
| GPU Utilization | ~90% (A10G) |
| Memory Usage | 18-22GB VRAM |
## Use Cases

- SQL Generation: Create complex queries from natural language (see the example after this list)
- Data Analysis: Generate insights and analytical queries
- Code Assistance: Debug and optimize SQL code
- Technical Support: Answer database and analytics questions
- Learning Aid: Explain SQL concepts and best practices
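As an illustration of the SQL-generation use case, the prompt can also be built with the tokenizer's chat template rather than hand-written special tokens. This assumes `model` and `tokenizer` are loaded as in the Quick Start; the query text is just an example:

```python
# Build a Llama 3.1 chat prompt via the tokenizer's chat template
messages = [
    {"role": "user",
     "content": "Write a SQL query that returns each region's total revenue per month for 2024."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```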
## Training Data
The model was trained on a curated dataset of 19,480 high-quality examples including:
- SQL query generation tasks
- Data analysis conversations
- Technical problem-solving dialogues
- Tool usage patterns and workflows
## Optimization Features

- Unsloth Integration: 2x faster training and inference (loading sketch below)
- 4-bit Quantization: Reduced memory footprint
- Flash Attention: Optimized attention mechanism
- Mixed Precision: BF16 training for efficiency
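If Unsloth is installed, the model can also be loaded through `FastLanguageModel`. This is a sketch that assumes the published weights are in a format Unsloth can load directly; the `max_seq_length` value is an assumption:

```python
from unsloth import FastLanguageModel

# Load with Unsloth's optimized kernels in 4-bit
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="abhishekgahlot/better-sql-agent-llama",
    max_seq_length=8192,   # assumed; raise as needed up to the model's context limit
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to the faster inference path
```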
## License

This model inherits the Llama 3.1 Community License from the base model. Please review the official license for usage terms.
## Acknowledgments
- Built with Unsloth for optimized training
- Based on Meta's Llama 3.1 8B Instruct model
- Trained on NVIDIA A10G GPU infrastructure
## Model Card Contact
For questions about this model, please open an issue in the repository or contact the model author.
Achieved 96.7% loss reduction - a testament to high-quality training data and optimization!