Asset from the SCALEMED Framework
This model/dataset is an asset released as part of the SCALEMED framework, a project focused on developing scalable and resource-efficient medical AI assistants.
Project Overview
The DermatoLlama models were trained on versions of the DermaSynth dataset, which was itself generated using the SCALEMED pipeline.
For a complete overview of the project, including all related models, datasets, and the source code, please visit our main Hugging Face organization page: https://huggingface.co/DermaVLM
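The training data can also be explored directly with the Hugging Face datasets library. A minimal sketch follows; note that the repository id "DermaVLM/DermaSynth" is an assumption based on the organization and dataset names above, so check the organization page for the exact path.

from datasets import load_dataset

# Hypothetical repository id -- verify the exact path on the DermaVLM organization page
dataset = load_dataset("DermaVLM/DermaSynth", split="train")
print(dataset[0])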
Usage
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from peft import PeftModel
from PIL import Image

# Load base model
# For GPU inference, consider passing torch_dtype=torch.bfloat16 and
# device_map="auto" (requires the accelerate package) to from_pretrained.
base_model_name = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(base_model_name)
processor = AutoProcessor.from_pretrained(base_model_name)

# Load LoRA adapter
adapter_path = "DermaVLM/DermatoLLama-10k"
model = PeftModel.from_pretrained(model, adapter_path)

# Inference
image_path = "DERM12345.jpg"
image = Image.open(image_path).convert("RGB")
prompt_text = "Describe the image in detail."

# Build a single-turn chat message: an image placeholder followed by the text prompt
messages = []
content_list = []
if image:
    content_list.append({"type": "image"})
# Add the text part of the prompt
content_list.append({"type": "text", "text": prompt_text})
messages.append({"role": "user", "content": content_list})

input_text = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
)

# Prepare final inputs
inputs = processor(
    images=image,
    text=input_text,
    add_special_tokens=False,
    return_tensors="pt",
).to(model.device)

generation_config = {
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 0.4,
    "top_p": 0.95,
}

input_length = inputs.input_ids.shape[1]
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        **generation_config,
        pad_token_id=(
            processor.tokenizer.pad_token_id
            if processor.tokenizer.pad_token_id is not None
            else processor.tokenizer.eos_token_id
        ),
    )

# Decode only the newly generated tokens, skipping the prompt
generated_tokens = outputs[0][input_length:]
raw_output = processor.decode(generated_tokens, skip_special_tokens=True)
print(raw_output)
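For repeated inference, the adapter weights can optionally be folded into the base model so each forward pass avoids the LoRA indirection. This is standard PEFT usage rather than a step from the original card; a minimal sketch:

# Optional: merge the LoRA weights into the base model for faster inference.
# After this call, `model` behaves like a plain MllamaForConditionalGeneration.
model = model.merge_and_unload()

Merging is convenient when serving the model, but keep the unmerged PeftModel if you plan to continue training or swap adapters.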
Citation
If you use this model, dataset, or any other asset from our work in your research, we kindly ask that you cite our preprint:
@article{Yilmaz2025-DermatoLlama-VLM,
author = {Yilmaz, Abdurrahim and Yuceyalcin, Furkan and Varol, Rahmetullah and Gokyayla, Ece and Erdem, Ozan and Choi, Donghee and Demircali, Ali Anil and Gencoglan, Gulsum and Posma, Joram M. and Temelkuran, Burak},
title = {Resource-efficient medical vision language model for dermatology via a synthetic data generation framework},
year = {2025},
doi = {10.1101/2025.05.17.25327785},
url = {https://www.medrxiv.org/content/early/2025/07/30/2025.05.17.25327785},
journal = {medRxiv}
}