# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0: Natural-Language-to-SQL
TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 is a 1.1 billion parameter model derived from TinyLlama/TinyLlama-1.1B-Chat-v1.0.
Using parameter-efficient LoRA fine-tuning with the Apple-Silicon-native MLX framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were merged ("fused") into the base weights, so you only need this single checkpoint for inference.
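For reference, merging LoRA adapters into base weights can be done with PEFT's `merge_and_unload()`. The sketch below is illustrative only (the adapter path is a placeholder, and the actual fuse for this checkpoint was performed with MLX tooling), not the exact script used:

```python
# Illustrative only: folding LoRA adapters into the base model with PEFT.
# The adapter path below is a placeholder, not the actual training artefact.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
lora = PeftModel.from_pretrained(base, "path/to/lora-adapters")  # hypothetical path

# merge_and_unload() folds the low-rank updates into the base weights and
# returns a plain transformers model that no longer needs PEFT at inference time.
fused = lora.merge_and_unload()
fused.save_pretrained("tinyllama-1.1b-chat-lora-fused-v1.0")
```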
## Key Facts
| Property | Value |
|---|---|
| Base model | TinyLlama 1.1B Chat v1.0 |
| Task | Natural-Language → SQL generation |
| Fine-tuning method | Low-Rank Adaptation (LoRA), rank = 16 |
| Training framework | MLX 0.8 + PEFT |
| Hardware | MacBook Pro M4 Pro (20-core GPU) |
| Checkpoint size | 2.1 GB (fp16, fused) |
| License | Apache 2.0 |
## Intended Use
- Interactive data exploration inside BI notebooks or chatbots.
- Customer-support analytics: empower non-SQL users to ask free-form questions.
- Education & demos showing how LoRA + MLX enable rapid on-device fine-tuning.
The model was trained on synthetic NL-to-SQL pairs for demonstration purposes. Do not deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and a security review.
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Include the relevant schema in the prompt; the model does not do schema-linking itself.
prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)
### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
```
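Because the base checkpoint is a chat model, wrapping the request in the tokenizer's chat template often helps. The variant below is a sketch that reuses the `prompt` string from above and assumes the fused checkpoint still ships TinyLlama's chat template:

```python
# Optional: format the request with the chat template inherited from TinyLlama-Chat.
# Assumes the fused checkpoint kept the base model's chat template.
messages = [{"role": "user", "content": prompt}]
chat_ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
sql_out = model.generate(chat_ids, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
```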
## Training Details
- Data: 10K synthetic NL/SQL pairs auto-generated from the open-domain schema list, then manually spot-checked for correctness.
- Pre-processing: schema + question paired using the Text-to-SQL prompt pattern (see the sketch below); SQL statements lower-cased; no anonymisation.
- Hyper-parameters
  - batch size = 32 (gradient accumulation = 4)
  - learning rate = 2e-4 (cosine schedule)
  - epochs = 3
  - LoRA rank = 16, α = 32
  - fp16 mixed precision

Total training time ≈ 5 minutes on Apple Silicon.
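For illustration only, the sketch below shows one way the prompt pattern and the LoRA settings listed above could be expressed with PEFT; the helper name, the example schema, and leaving target modules at PEFT defaults are assumptions, not the exact MLX training script:

```python
from peft import LoraConfig

def build_prompt(schema: str, question: str) -> str:
    # Pair schema and question using the same Text-to-SQL prompt pattern
    # shown in the Quick Start section. (Helper name is hypothetical.)
    return f"### Database schema\n{schema}\n### Question\n{question}"

# LoRA settings matching the hyper-parameters above (rank 16, alpha 32).
lora_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

example = build_prompt(
    "table orders(id, customer_id, total, created_at)",
    "How many orders were placed last month?",
)
```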
## Environmental Impact
LoRA fine-tuning on consumer Apple Silicon is energy-efficient: the full run for this checkpoint completed in roughly five minutes on a single MacBook Pro.
## Limitations & Biases
- Trained on a synthetic, limited dataset, so it may under-perform on real production schemas.
- Does not perform schema-linking; you must include the relevant schema in the prompt.
- Generated SQL is not guaranteed to be safe; always validate queries before execution (see the sketch below).
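As one possible pre-execution guard, the sketch below combines a crude allow-list check with a dry-run compile against a read-only SQLite connection; the function names, the SQLite backend, and the `analytics.db` path are illustrative assumptions, not part of this model:

```python
import sqlite3

def looks_like_single_select(sql: str) -> bool:
    # Crude allow-list: accept only a single SELECT statement, no stacked queries.
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped

def dry_run(sql: str, db_path: str) -> None:
    # EXPLAIN compiles the statement without running it, so syntax errors and
    # unknown tables/columns surface before any data is touched. The read-only
    # URI guards against accidental writes.
    with sqlite3.connect(f"file:{db_path}?mode=ro", uri=True) as conn:
        conn.execute("EXPLAIN " + sql)

candidate = "SELECT country, SUM(total) FROM orders JOIN customers ON orders.customer_id = customers.id GROUP BY country"
if looks_like_single_select(candidate):
    dry_run(candidate, "analytics.db")  # hypothetical database file; raises sqlite3.Error on invalid SQL
```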
## Citation
```bibtex
@misc{mohanan2024tinyllama_sql_lora,
  title  = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author = {Jerome Mohanan},
  note   = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year   = {2024}
}
```
## Contact
Questions or feedback? Ping @jero2rome on Hugging Face or email [email protected].