DeepSeek-R1-Distill-Llama-8B Fine-tuned on Finance Dataset
This model is a fine-tuned version of DeepSeek-R1-Distill-Llama-8B using LoRA adapters, trained on financial instruction data.
Model Details
- Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
- Fine-tuning Method: LoRA
- LoRA Parameters: r=4, alpha=16
- Target Modules: q_proj, k_proj, v_proj, o_proj
- Training Dataset: Rishi-19/finance-instruct-dataset
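To give a sense of how small this adapter is, the sketch below estimates the number of trainable LoRA weights implied by r=4 on the four attention projections. The layer count and projection shapes are assumptions based on the standard Llama-3.1-8B architecture (32 layers, hidden size 4096, 8 KV heads of dimension 128); they are not stated in this card. The scaling factor follows the usual LoRA convention of alpha / r.

```python
# Assumed Llama-3.1-8B-style dimensions (not stated in this model card).
HIDDEN = 4096      # hidden size
KV_DIM = 1024      # 8 KV heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 32    # decoder layers

# LoRA hyperparameters from the Model Details list.
R = 4
ALPHA = 16

# (in_features, out_features) for each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Weights in one LoRA pair: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / R  # factor applied to the low-rank update

print(f"LoRA weights per layer: {per_layer:,}")
print(f"LoRA weights in total:  {total:,}")
print(f"Update scaling (alpha / r): {scaling}")
```

Under these assumptions the adapter holds a few million weights, a tiny fraction of the roughly 8 billion in the base model.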
Use Cases
This model is intended for financial analysis, valuation calculations, and financial advisory tasks.
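For valuation prompts, it can help to check the model's arithmetic against a direct calculation. The following is a minimal sketch of a net present value helper; the function name `npv`, the 10% discount rate, and the cash flows are hypothetical and chosen only for illustration.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value, where cash_flows[0] occurs today (t = 0)
    and each later entry occurs one period after the previous one."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical project: $1M outlay today, then three annual inflows.
flows = [-1_000_000, 400_000, 400_000, 400_000]
value = npv(0.10, flows)
print(f"NPV at a 10% discount rate: {value:,.2f}")
```

A generated answer can then be compared against `value` rather than accepted at face value.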
Example Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig
import torch

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer from the adapter repository
model_name = "Rishi-19/deepseek_finetuned_model_rishi"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the base model first; its name is read from the adapter config.
# Per Model Details, the base checkpoint is a bitsandbytes 4-bit build,
# so the bitsandbytes package is likely required.
peft_config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    peft_config.base_model_name_or_path,
    torch_dtype=torch.float16,  # Use half precision to save memory
    device_map="auto",
    trust_remote_code=True,
)

# Then load the PEFT adapter on top of the base model
model = PeftModel.from_pretrained(base_model, model_name)
model.eval()  # Set to evaluation mode

# Generate text
prompt = "Calculate the Net Present Value of a project with initial investment of $1M"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
with torch.no_grad():
    # max_length counts the prompt tokens as well as the generated ones
    outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0]))