OpenLlama-3B-v2 Fine-tuned with GRIT and QLoRA

This model is a fine-tuned version of openlm-research/open_llama_3b_v2 using the GRIT (Gradient Regularized Instruction Tuning) algorithm and QLoRA on the Alpaca dataset.

The base model is quantized to 4-bit (NF4) to enable efficient fine-tuning.
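πŸ’» Usage

A minimal sketch of loading the base model in 4-bit NF4 and attaching this adapter with peft. The adapter repo id is taken from the citation below; the Alpaca-style prompt template is an assumption and may need adjusting.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "openlm-research/open_llama_3b_v2"
adapter_id = "Pritish92/open-llama-3b-v2-grit-alpaca"

# 4-bit NF4 quantization with float16 compute, matching the training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Alpaca-style prompt (assumed format; adjust if the adapter used a different template)
prompt = "### Instruction:\nExplain what LoRA is in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```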

πŸš€ Training Details

GRIT Algorithm

  • K-FAC Updates: Every 200 steps for second-order preconditioning
  • Neural Reprojection: Every 500 steps for rank optimization
  • Optimized LoRA Modules: attention projections plus key MLP layers, per the GRIT design (the update schedule is sketched below)
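
As a rough illustration only (not the reference implementation), the interplay between the two intervals might look like the sketch below; `apply_kfac_preconditioning` and `reproject_lora_ranks` are hypothetical placeholders standing in for the actual GRIT routines.

```python
# Hypothetical training-loop sketch of the GRIT schedule described above.
# The helper functions are placeholders, not part of any released library.
KFAC_EVERY = 200          # second-order preconditioning interval
REPROJECT_EVERY = 500     # rank-optimization (neural reprojection) interval

def apply_kfac_preconditioning(model, optimizer):
    """Placeholder: precondition LoRA gradients with K-FAC curvature factors."""
    pass

def reproject_lora_ranks(model):
    """Placeholder: re-estimate and reproject LoRA ranks."""
    pass

def grit_training_loop(model, optimizer, dataloader, num_steps):
    step = 0
    for batch in dataloader:
        loss = model(**batch).loss
        loss.backward()

        if step % KFAC_EVERY == 0:
            apply_kfac_preconditioning(model, optimizer)
        if step > 0 and step % REPROJECT_EVERY == 0:
            reproject_lora_ranks(model)

        optimizer.step()
        optimizer.zero_grad()
        step += 1
        if step >= num_steps:
            break
```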

Fine-tuning Configuration

  • Base Model: OpenLlama 3B v2
  • Quantization: 4-bit (NF4) with float16 compute
  • LoRA Rank: 64
  • LoRA Alpha: 128
  • Batch Size: 16 (per device)
  • Gradient Accumulation: 4 (Effective batch = 64)
  • Learning Rate: 5.0e-05
  • Precision: bf16 mixed precision
  • Sequence Length: 512 tokens
  • Gradient Checkpointing: Enabled
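
For reference, here is a hedged sketch of how these settings map onto standard transformers/peft/bitsandbytes objects. The argument names are the libraries' standard ones, but the target modules, LoRA dropout, and output directory are assumptions; the original training script may differ.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 base-model quantization with float16 compute (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA rank/alpha as listed above; target modules are an assumption
# based on "attention projections plus key MLP layers"
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,               # assumed; not stated in this card
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="open-llama-3b-v2-grit-alpaca",  # assumed
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,   # effective batch size 64
    learning_rate=5e-5,
    bf16=True,                       # mixed precision
    gradient_checkpointing=True,
    num_train_epochs=1,              # single-epoch training, per the Results section
)
# Sequences are truncated/packed to 512 tokens at the tokenization/collation stage.
```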

Performance Improvements

  • βœ… Faster Convergence: K-FAC preconditioning aligns updates with the loss curvature
  • βœ… Memory-Efficient: 4-bit quantization (QLoRA) and gradient checkpointing keep the memory footprint low
  • βœ… Efficient Training: built on Hugging Face accelerate

πŸ“Š Training Metrics

  • Total Steps: 732
  • Final Training Loss: 0.2282
  • Final Validation Loss: 0.22849
  • BLEU (val): 0.2452
  • Trainable Params: 42,598,400 (1.23% of total)

🏷️ Model Tags

  • Instruction-tuned with GRIT and QLoRA
  • GRIT-tuned Model
  • 4-bit Quantized Model
  • LoRA rank 64
  • Mixed precision (bf16)
  • Alpaca dataset fine-tuning

πŸ“ Algorithm Details

  • K-FAC Preconditioning (Natural Gradient) and Neural Reprojection as per GRIT method
  • Memory Efficient: covariance matrices are kept on CPU to reduce GPU memory load (illustrated below)
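
A minimal illustration of the CPU-offload idea: store the per-layer K-FAC covariance factors on CPU and move them to the GPU only while preconditioning a gradient. This is a hypothetical sketch, not the actual GRIT implementation.

```python
import torch

class KFACFactorStore:
    """Keep per-layer K-FAC covariance factors (A, G) on CPU to save GPU memory.

    Illustrative sketch only; the real GRIT code may organize this differently.
    """

    def __init__(self):
        self.factors = {}  # layer name -> (A, G) tensors stored on CPU

    def update(self, name, A, G, decay=0.95):
        # Maintain an exponential moving average of the factors on CPU
        A_cpu, G_cpu = A.detach().cpu(), G.detach().cpu()
        if name in self.factors:
            old_A, old_G = self.factors[name]
            A_cpu = decay * old_A + (1 - decay) * A_cpu
            G_cpu = decay * old_G + (1 - decay) * G_cpu
        self.factors[name] = (A_cpu, G_cpu)

    def precondition(self, name, grad, damping=1e-3):
        # Move factors to the gradient's device only for the solve
        A, G = (f.to(grad.device) for f in self.factors[name])
        A_inv = torch.inverse(A + damping * torch.eye(A.shape[0], device=grad.device))
        G_inv = torch.inverse(G + damping * torch.eye(G.shape[0], device=grad.device))
        return G_inv @ grad @ A_inv   # natural-gradient-style preconditioned update
```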

πŸ† Results

In benchmark comparisons, GRIT has shown faster convergence and better stability than standard LoRA or full fine-tuning, making it well-suited for efficient single-epoch training.

πŸ“ Citation

If you use this model, please cite:

@misc{grit-openllama-3b-alpaca,
  title={OpenLlama 3B v2 Fine-tuned with GRIT on Alpaca},
  author={Pritish92},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/Pritish92/open-llama-3b-v2-grit-alpaca}
}

βš–οΈ License

This adapter inherits the Apache 2.0 license from the base openlm-research/open_llama_3b_v2 model.
