DeepSeek R1 Medical Reasoning (Fine-Tuned with LoRA)

This repository contains a LoRA fine-tune of DeepSeek-R1-Distill-Llama-8B, adapted for medical reasoning tasks. It was trained on a subset of the Medical O1 Reasoning SFT dataset using Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning.

Model Information

  • Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
  • LoRA Rank: 16
  • LoRA Alpha: 16
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Dataset: Medical O1 Reasoning SFT

Fine-tuning Configuration

  • Epochs: 1
  • Max Steps: 60
  • Batch Size per Device: 2
  • Gradient Accumulation Steps: 4
  • Learning Rate: 2e-4
  • Optimizer: AdamW (8-bit)
  • Precision: FP16 (mixed precision)
  • Seed: 3407 (for reproducibility)
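
With Max Steps set to 60, training stops well before one full epoch, so the step budget is the effective stopping criterion. A quick sanity check of what these settings imply (values taken directly from the card above):

```python
# Training budget implied by the configuration above.
per_device_batch_size = 2
gradient_accumulation_steps = 4
max_steps = 60

# Gradients are accumulated over 4 micro-batches before each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
examples_seen = effective_batch_size * max_steps

print(effective_batch_size)  # 8 examples per optimizer step
print(examples_seen)         # 480 training examples seen in total
```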

Targeted Modules for LoRA

  • Self-attention projections (q_proj, k_proj, v_proj, o_proj)
  • Feed-forward layers (gate_proj, up_proj, down_proj)
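
A minimal sketch of how these adapters would be attached with unsloth's `FastLanguageModel.get_peft_model`, assuming `model` was already loaded via `FastLanguageModel.from_pretrained` (the exact dropout and gradient-checkpointing settings are assumptions, not stated on this card):

```python
from unsloth import FastLanguageModel

# Attach LoRA adapters to the attention and MLP projections listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                    # LoRA rank (from this card)
    lora_alpha=16,           # LoRA alpha (from this card)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",      # self-attention
        "gate_proj", "up_proj", "down_proj",         # feed-forward
    ],
    lora_dropout=0,          # assumption: no dropout
    bias="none",             # assumption: bias terms left frozen
    random_state=3407,       # matches the seed above
)
```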

Usage

from unsloth import FastLanguageModel

model_id = "NikkeS/deepSeek-finetuned-Medical-O1-Reasoning-SFT"

# Load fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    max_seq_length=2048,
    load_in_4bit=True
)

# Set to inference mode
FastLanguageModel.for_inference(model)

# Example inference
question = """A 61-year-old woman with involuntary urine loss during coughing but no leakage at night undergoes a gynecological exam and Q-tip test. What would cystometry reveal about residual volume and detrusor contractions?"""

inputs = tokenizer([question], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200
)
response = tokenizer.batch_decode(outputs)[0]
print(response)
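
R1-style distills typically emit their chain of thought before a `</think>` marker, followed by the final answer. If you only want the answer, a small helper like this (a sketch; the marker's presence depends on the prompt and generation settings) can strip the reasoning:

```python
def extract_final_answer(decoded: str) -> str:
    """Return the text after the last </think> marker, or the full text if absent."""
    marker = "</think>"
    if marker in decoded:
        return decoded.rsplit(marker, 1)[1].strip()
    return decoded.strip()

sample = "<think>Stress incontinence reasoning...</think>\nCystometry would show..."
print(extract_final_answer(sample))  # Cystometry would show...
```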
