# llama3b-legal-sft

Fine-tuned LoRA adapter on Meta Llama-3.2-3B-Instruct with 4-bit quantization.

Task: drafting Indian-law documents (eviction notices, affidavits, show-cause notices, leases, powers of attorney, etc.).
## Model Details

- Base model: `meta-llama/Llama-3.2-3B-Instruct`
- Fine-tuning recipe:
  - Data: 2.7 M cleaned Q&A pairs from Prarabdha gated repos, plus 11 K examples from `Hashif/indianlegal-llama-2`
  - 90 % train / 10 % validation split
  - 4-bit quantization + LoRA (r=8, α=16, dropout=0.1); a configuration sketch follows this list
  - Trainer: custom `SFTTrainer`, fp16, batch 4→16, max_steps=20 000
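For reference, a minimal sketch of the quantization and LoRA settings described above. Only `r=8`, `lora_alpha=16`, `lora_dropout=0.1`, and "4-bit" come from this card; the `nf4` quant type, compute dtype, and `target_modules` are assumptions (common choices for Llama-style models), not the authors' published config:

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization, as stated in the recipe above.
# nf4 + fp16 compute dtype are assumptions; the card only says "4-bit".
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

# LoRA hyper-parameters from the recipe: r=8, alpha=16, dropout=0.1.
# target_modules is an assumption, not confirmed by the card.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```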
## Evaluation

| Metric | Value |
|---|---|
| Perplexity | 1.53 |

Inference speed on an A100: ~0.5 it/s at batch size 1.
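The card does not publish the evaluation script. For context, perplexity is the exponential of the mean token-level cross-entropy on held-out text; the sketch below shows one standard way to compute it, where `val_texts` is a hypothetical list of validation strings:

```python
import math
import torch

@torch.no_grad()
def perplexity(model, tokenizer, val_texts):
    """Token-weighted mean-NLL perplexity over raw text examples."""
    total_nll, total_tokens = 0.0, 0
    for text in val_texts:
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        # With labels=input_ids, the model returns the mean cross-entropy
        # over the shifted targets (seq_len - 1 predicted tokens).
        out = model(**enc, labels=enc.input_ids)
        n = enc.input_ids.numel() - 1
        total_nll += out.loss.item() * n
        total_tokens += n
    return math.exp(total_nll / total_tokens)
```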
## Limitations & Intended Use

- Intended for drafting legal-style documents under Indian law.
- Not a substitute for qualified legal counsel.
- May occasionally repeat phrases or lose document structure when prompts are vague or poorly structured; the decoding settings sketched below can help reduce repetition.
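One decode-time mitigation for phrase repetition is to add a repetition penalty and an n-gram block to `generate`. This is a generic sketch, not a tuned recommendation from the model authors; it reuses the `model`, `inputs`, and `tokenizer` objects defined in the Usage section below:

```python
# Illustrative anti-repetition decoding settings (values are assumptions).
gen_ids = model.generate(
    **inputs,
    max_new_tokens=400,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,   # penalize tokens that were already generated
    no_repeat_ngram_size=4,   # forbid verbatim 4-gram repeats
    pad_token_id=tokenizer.eos_token_id,
)
```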
## Sample Validation

> ✅ Eviction notice generated by this model was reviewed and approved by Advocate Abhishek Chatterjee.
## Usage
```python
from transformers import AutoTokenizer, BitsAndBytesConfig, AutoModelForCausalLM
from peft import PeftModel
import os

HF_TOKEN = os.getenv("HF_TOKEN")  # or set directly: "hf_xxx"
BASE_ID = "meta-llama/Llama-3.2-3B-Instruct"  # gated base model
REPO_ID = "Subimal10/llama3b-legal-sft"       # LoRA adapter repo

# 1️⃣ Load tokenizer + base model in 4-bit, then attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(REPO_ID, use_fast=True, token=HF_TOKEN)

bnb_cfg = BitsAndBytesConfig(load_in_4bit=True)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID,  # load the base weights, not the adapter-only repo
    quantization_config=bnb_cfg,
    device_map="auto",
    trust_remote_code=True,
    token=HF_TOKEN,
)
model = PeftModel.from_pretrained(base, REPO_ID, device_map="auto", token=HF_TOKEN)
model.eval()

# 2️⃣ Inference with an instruction prompt
prompt = (
    "<s>[INST] <<SYS>>\n"
    "You are a senior contract lawyer.\n"
    "<</SYS>>\n\n"
    "### Instruction:\n"
    "Draft a formal Show Cause Notice under Indian contract law to a contractor "
    "for delays in project delivery.\n"
    "### Response:\n"
    "[/INST] "
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
gen_ids = model.generate(
    **inputs,
    max_new_tokens=400,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt
completion = tokenizer.decode(
    gen_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
)
print("=== Show Cause Notice ===\n", completion)
```