Llama for Finance

A finance-domain instruction-tuned Llama-3 model, fine-tuned with LoRA on the Finance-Instruct-500k dataset.

Model Details

  • Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Training: LoRA fine-tuning
  • Domain: Finance, Economics, Investment
  • Language: English
  • Context Length: 512 tokens (training max_length)
  • Training Data: Josephgflowers/Finance-Instruct-500k
  • Evaluation: Held-out test set + FinanceBench

Training Configuration

  • Quantization: 8-bit (see the load sketch after this list)
  • Batch Size: 2 per device
  • Gradient Accumulation Steps: 8
  • Learning Rate: 2e-4
  • Number of Epochs: 1
  • Evaluation Steps: 50
  • Save Steps: 100
  • Logging Steps: 25
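
No training script ships with this card, so the following is a minimal sketch of the 8-bit base-model load these settings imply, assuming bitsandbytes quantization via transformers; variable names are illustrative:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Load the base model with 8-bit weights to reduce memory during training
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
# Casts norms to fp32 and enables input grads, as required for k-bit LoRA training
base_model = prepare_model_for_kbit_training(base_model)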

LoRA Parameters

  • Target Modules:
    • Attention: q_proj, k_proj, v_proj, o_proj
    • MLP: gate_proj, up_proj, down_proj
  • Rank (r): 16
  • Alpha: 32
  • Dropout: 0.1
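
These parameters map directly onto a peft LoraConfig; a sketch assuming the standard peft API (the bias setting is an assumption, as the card does not state it):

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,              # LoRA rank
    lora_alpha=32,     # scaling factor (alpha / r = 2.0)
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    bias="none",           # assumption: bias terms left untrained
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()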

Optimization Details

  • Precision: BF16 (if available) or FP16
  • Gradient Checkpointing: Enabled
  • Scheduler: Cosine with warmup (ratio: 0.03)
  • Weight Decay: 0.01
  • Max Gradient Norm: 1.0
  • Data Loading: 2 workers, pinned memory
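
Together with the Training Configuration above, these settings correspond roughly to the following transformers TrainingArguments (output directory illustrative; the effective batch size works out to 2 × 8 = 16 per device):

import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_for_finance",           # illustrative path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    max_grad_norm=1.0,
    bf16=torch.cuda.is_bf16_supported(),      # BF16 if available...
    fp16=not torch.cuda.is_bf16_supported(),  # ...otherwise FP16
    gradient_checkpointing=True,
    eval_strategy="steps",                    # "evaluation_strategy" on older transformers
    eval_steps=50,
    save_steps=100,
    logging_steps=25,
    dataloader_num_workers=2,
    dataloader_pin_memory=True,
)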

Usage

This repository contains only a LoRA adapter for Llama-3.1; you need access to the gated meta-llama/Meta-Llama-3.1-8B-Instruct base model to use it.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model in bf16; device_map="auto" places it on available GPUs
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "TimberGu/Llama_for_Finance")  # attach the adapter
tokenizer = AutoTokenizer.from_pretrained("TimberGu/Llama_for_Finance")
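
A quick generation example using the Llama-3 chat template (prompt and decoding settings are illustrative):

messages = [{"role": "user", "content": "Explain the difference between stocks and bonds."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens before decoding the answer
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))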

Evaluation Results

The model has been evaluated on:

  1. Held-out test set from Finance-Instruct-500k
  2. FinanceBench open-book QA benchmark

See test_results.json for detailed metrics including:

  • BLEU scores
  • ROUGE-1/2/L scores
  • Perplexity
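
For reference, BLEU and ROUGE scores of this kind are commonly computed with the Hugging Face evaluate library, and perplexity as the exponential of the mean cross-entropy loss over the held-out set; a sketch with placeholder strings (not the actual test data):

import evaluate

predictions = ["Bond yields rose sharply."]   # placeholder model outputs
references = ["Bond yields rose."]            # placeholder gold answers

bleu = evaluate.load("bleu").compute(predictions=predictions,
                                     references=[[r] for r in references])
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
print(bleu["bleu"], rouge["rouge1"], rouge["rouge2"], rouge["rougeL"])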

Limitations

  • Requires access to Meta's gated Llama-3 base model; make sure your hardware has enough memory to load the full 8B model
  • Performance may vary on non-financial topics
  • Should not be used as the sole basis for financial decisions
  • Training context length was limited to 512 tokens due to GPU memory constraints