TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 – Natural-Language-to-SQL

TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 is a 1.1 billion parameter model derived from TinyLlama/TinyLlama-1.1B-Chat-v1.0.
Using parameter-efficient LoRA fine-tuning and the new Apple-Silicon-native MLX framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were merged ("fused") into the base weights, so you only need this single checkpoint for inference.
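
For intuition, fusing simply folds each adapter's low-rank update back into the corresponding base weight matrix. A minimal sketch of the arithmetic with toy shapes (generic LoRA maths, not the actual MLX fusion code):

import numpy as np

# Toy shapes: one base weight matrix W plus a rank-16 LoRA adapter (A, B).
d_out, d_in, rank, alpha = 256, 256, 16, 32
W = np.random.randn(d_out, d_in).astype(np.float16)
A = np.random.randn(rank, d_in).astype(np.float16)  # "down" projection
B = np.zeros((d_out, rank), dtype=np.float16)        # "up" projection, initialised to zero

# Fusing adds the scaled low-rank update into the base weight once,
# so inference needs no adapter-aware code paths or separate adapter files.
W_fused = W + (alpha / rank) * (B @ A)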


πŸ—οΈ Key Facts

Property             Value
Base model           TinyLlama 1.1B Chat v1.0
Task                 Natural-Language → SQL generation
Fine-tuning method   Low-Rank Adaptation (LoRA), rank = 16
Training framework   MLX 0.8 + PEFT
Hardware             MacBook Pro M4 Pro (20-core GPU)
Checkpoint size      2.1 GB (fp16, fused)
License              Apache 2.0

✨ Intended Use

  • Interactive data exploration inside BI notebooks or chatbots.
  • Customer-support analytics: empower non-SQL users to ask free-form questions.
  • Education & demos showing how LoRA + MLX enables rapid on-device fine-tuning.

The model was trained on synthetic NL-SQL pairs for demo purposes. Do not deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and a security review.


💻 Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fused checkpoint; no separate LoRA adapter files are needed.
model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

# Tokenise the prompt and generate the SQL continuation.
inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
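
Because the checkpoint was trained and fused with MLX, it can also be run natively on Apple Silicon via the mlx-lm package; a rough sketch, assuming the fused weights load directly from the Hub:

from mlx_lm import load, generate

# Reuses the `prompt` string defined above.
model, tokenizer = load("jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0")
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))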

πŸ‹οΈβ€β™‚οΈ Training Details

  • Data – 10K synthetic NL/SQL pairs auto-generated from an open-domain schema list, then manually spot-checked for correctness.
  • Pre-processing – schema + question paired using the Text-to-SQL prompt pattern; SQL statements lower-cased; no anonymisation.
  • Hyper-parameters
    • batch size = 32 (gradient accumulation = 4)
    • learning rate = 2e-4 (cosine schedule)
    • epochs = 3
    • LoRA rank = 16, α = 32
    • fp16 mixed precision

Total training time ≈ 5 minutes on Apple Silicon.
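
For reference, the adapter settings above map roughly onto a PEFT configuration like the one below (a sketch; the target modules are an assumption, since the card does not list which projection layers were adapted):

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                 # LoRA rank
    lora_alpha=32,                        # scaling factor α
    target_modules=["q_proj", "v_proj"],  # assumption: attention projections only
    task_type="CAUSAL_LM",
)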


🌱 Environmental Impact

LoRA fine-tuning on consumer Apple Silicon is energy-efficient: the entire run completed in roughly five minutes on a single MacBook Pro, so the associated energy use was minimal.


πŸ› οΈ Limitations & Biases

  • Trained on a synthetic, limited dataset → may under-perform on real production schemas.
  • Does not perform schema linking; you must include the relevant schema in the prompt.
  • Generated SQL is not guaranteed to be safe; always validate queries before execution (see the sketch below).
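
One lightweight way to validate generated SQL before it touches a real database is to compile it against an empty in-memory copy of the schema. A minimal standard-library sketch (the CREATE TABLE statements mirror the Quick Start schema and are purely illustrative):

import sqlite3

def sql_compiles(sql: str, schema_ddl: list[str]) -> bool:
    """Return True if `sql` parses and binds against an empty copy of the schema."""
    con = sqlite3.connect(":memory:")
    try:
        for ddl in schema_ddl:
            con.execute(ddl)
        # EXPLAIN compiles the statement without executing the underlying query.
        con.execute("EXPLAIN " + sql)
        return True
    except sqlite3.Error:
        return False
    finally:
        con.close()

schema_ddl = [
    "CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL, created_at TEXT)",
    "CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)",
]
print(sql_compiles("SELECT c.country, SUM(o.total) AS total_sales "
                   "FROM orders o JOIN customers c ON o.customer_id = c.id "
                   "GROUP BY c.country ORDER BY total_sales DESC", schema_ddl))

This only checks that the query compiles against the schema; it does not guard against destructive statements or enforce access control, so a proper review is still required for production use.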

✍️ Citation

@misc{mohanan2024tinyllama_sql_lora,
  title   = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author  = {Jerome Mohanan},
  note    = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year    = {2024}
}

📫 Contact

Questions or feedback? Ping @jero2rome on Hugging Face or email [email protected].
