Meditron3-8B LoRA Adapter for Medical MCQ JSON Generation

This is a LoRA (Low-Rank Adaptation) adapter for the OpenMeditron/Meditron3-8B model, fine-tuned for medical multiple-choice question answering with structured JSON output generation.

Model Details

Base Model

  • Model: OpenMeditron/Meditron3-8B
  • Architecture: Llama-based medical language model
  • Parameters: 8B
  • Precision: BFloat16

LoRA Configuration

  • Rank (r): 64
  • Alpha: 128
  • Dropout: 0.1
  • Target Modules: v_proj, q_proj, o_proj, k_proj, down_proj, gate_proj, up_proj
  • Task Type: Causal Language Modeling
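
In PEFT, this corresponds to a `LoraConfig` like the following (a sketch assembled from the values listed above; not the original training script):

```python
from peft import LoraConfig

# LoRA hyperparameters from the list above
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```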

Training Details

Dataset

  • Source: asanchez75/medical_textbooks_mcq
  • Domain: Medical multiple-choice questions
  • Language: Primarily French medical content
  • Format: JSON-structured input/output pairs
  • Size: 1,481 examples (1,184 train, 148 validation, 149 test)

Training Configuration

  • Epochs: 3
  • Learning Rate: 2e-5
  • Batch Size: 4 (per device)
  • Gradient Accumulation: 4 steps
  • Effective Batch Size: 16
  • Sequence Length: 2048 tokens
  • Hardware: NVIDIA A100 SXM4 40GB
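
These settings map onto Hugging Face `TrainingArguments` roughly as follows (a sketch; `output_dir` is a placeholder, and any option not listed above is an assumption, not taken from the original run):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="meditron3-8b-mcq-lora",  # placeholder path
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,       # effective batch size: 4 * 4 = 16
    bf16=True,                           # matches the BFloat16 precision above
)
```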

Performance

  • Final Test Loss: 0.7995
  • Training Time: ~18.5 minutes (1,107 seconds)
  • Memory Usage: 23.2GB peak (A100 40GB)
  • LoRA Memory Usage: 7.59GB additional for training

Usage

Loading the Adapter

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "OpenMeditron/Meditron3-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("OpenMeditron/Meditron3-8B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "asanchez75/meditron3-8b-mcq-lora")

Inference Example

import json

# Input format (medical context text)
input_text = "L'hypertension artérielle essentielle est une maladie chronique caractérisée par une pression artérielle élevée. Le traitement de première intention comprend les modifications du mode de vie et les médicaments antihypertenseurs."

# Format prompt using the same structure as training
prompt_prefix = "<|user|>
Context:
"
prompt_suffix = "

Generate ONE valid multiple-choice question based strictly on the context above. Output ONLY the valid JSON object representing the question.
MCQ JSON:<|end|>
<|assistant|>
"

# Generate response
formatted_prompt = prompt_prefix + input_text + prompt_suffix
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)

Expected Output Format

{
    "question": "Quel est le traitement de première intention de l'hypertension artérielle essentielle?",
    "options": {
        "A": "Inhibiteurs de l'ECA",
        "B": "Bêta-bloquants",
        "C": "Diurétiques thiazidiques",
        "D": "Antagonistes calciques"
    },
    "correct_answer": "A",
    "explanation": "Les inhibiteurs de l'ECA sont recommandés en première intention pour le traitement de l'hypertension artérielle essentielle selon les guidelines internationales."
}
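
Because the decoded text may echo the prompt or append trailing tokens, the JSON object usually needs to be extracted before use. A minimal helper for that (hypothetical, not shipped with the adapter; it matches braces naively, so braces inside string values would confuse it):

```python
import json

def extract_mcq(generated_text: str) -> dict:
    """Extract and validate the first JSON object in the model output."""
    start = generated_text.index("{")
    depth = 0
    for i, ch in enumerate(generated_text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Parse from the first "{" to its matching "}"
                mcq = json.loads(generated_text[start:i + 1])
                break
    else:
        raise ValueError("no complete JSON object found")
    # Basic schema check against the expected format above
    required = {"question", "options", "correct_answer", "explanation"}
    if required - mcq.keys():
        raise ValueError(f"missing keys: {required - mcq.keys()}")
    if mcq["correct_answer"] not in mcq["options"]:
        raise ValueError("correct_answer not among the options")
    return mcq
```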

Model Architecture

This adapter targets the following modules in the Meditron3-8B model:

  • Query projection (q_proj)
  • Key projection (k_proj)
  • Value projection (v_proj)
  • Output projection (o_proj)
  • Gate projection (gate_proj)
  • Up projection (up_proj)
  • Down projection (down_proj)
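
As a rough sanity check on adapter size, each LoRA pair (A: in x r, B: r x out) on a linear layer of shape (in, out) adds r * (in + out) trainable parameters. The dimensions below assume a standard Llama-3-8B layout (the card only says "Llama-based", so these are assumptions):

```python
# Assumed Llama-3-8B dimensions: hidden size 4096, MLP intermediate
# size 14336, 8 KV heads (so k_proj/v_proj project to 1024 dims),
# 32 decoder layers; LoRA rank r = 64 as configured above.
r = 64
hidden, intermediate, kv_out, layers = 4096, 14336, 1024, 32

# (fan_in, fan_out) for each adapted module in one decoder layer
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_out),
    "v_proj": (hidden, kv_out),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}
per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
total = per_layer * layers
print(f"{total:,} trainable LoRA parameters")  # prints "167,772,160 trainable LoRA parameters"
```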

Limitations and Biases

  • Domain Specific: Optimized for French medical content
  • MCQ Format: Designed for structured multiple-choice questions
  • Medical Focus: Performance may vary on non-medical content
  • Language: Primarily trained on French medical terminology

Citation

If you use this model, please cite:

@misc{meditron3-8b-mcq-lora,
    title={Meditron3-8B LoRA Adapter for Medical MCQ JSON Generation},
    author={Your Name},
    year={2025},
    publisher={Hugging Face},
    url={https://huggingface.co/asanchez75/meditron3-8b-mcq-lora}
}

License

This adapter is released under the Apache-2.0 license, consistent with the base Meditron3-8B model.
