# 🩺 Medical Image Captioning with BLIP on ROCO Dataset

This project fine-tunes the `Salesforce/blip-image-captioning-base` model on the `eltorio/ROCOv2-radiology` dataset to generate report-style captions for radiology images.


## 📦 Model Details

- **Base model:** `Salesforce/blip-image-captioning-base`
- **Fine-tuning dataset:** `eltorio/ROCOv2-radiology`
- **Task:** image-to-text (radiology report captioning)


## 🧪 Example Inference
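The example below assumes the standard Hugging Face stack is installed:

```shell
pip install transformers torch pillow requests
```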

```python
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image
import requests

# Load the fine-tuned processor and model from the Hub
processor = BlipProcessor.from_pretrained("khalednabawi11/blip-roco-model")
model = BlipForConditionalGeneration.from_pretrained("khalednabawi11/blip-roco-model")

# Replace with the URL of a radiology image
img_url = "https://example.com/sample-chest-xray.jpg"
image = Image.open(requests.get(img_url, stream=True).raw).convert("RGB")

# Preprocess the image, generate token IDs, and decode them to text
inputs = processor(images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs)
caption = processor.decode(output[0], skip_special_tokens=True)

print("Generated Report:", caption)
```
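With default settings, `generate` performs greedy autoregressive decoding: starting from a begin-of-sequence token, it repeatedly appends the highest-scoring next token until an end token or a length limit is reached. The toy sketch below illustrates that loop; `dummy_scores` is a hypothetical stand-in for the real BLIP text decoder, used only to make the mechanics concrete.

```python
# Toy greedy decoder illustrating, conceptually, what generate() does.
def greedy_decode(next_token_scores, bos_id, eos_id, max_len=10):
    """next_token_scores(seq) -> one score per vocabulary id."""
    seq = [bos_id]
    for _ in range(max_len):
        scores = next_token_scores(seq)
        best = max(range(len(scores)), key=scores.__getitem__)  # argmax
        seq.append(best)
        if best == eos_id:
            break
    return seq

# Dummy "model" over a 5-token vocabulary: favors token (len(seq) % 5),
# then forces the EOS token (id 4) once the sequence reaches length 4.
def dummy_scores(seq):
    if len(seq) >= 4:
        return [0.0, 0.0, 0.0, 0.0, 1.0]  # force EOS
    scores = [0.0] * 5
    scores[len(seq) % 5] = 1.0
    return scores

print(greedy_decode(dummy_scores, bos_id=0, eos_id=4))  # → [0, 1, 2, 3, 4]
```

The real decoder scores tokens with a transformer conditioned on the image features, and `generate` also supports beam search and sampling, but the stopping and append logic is the same.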
## 📊 Model Size

247M parameters, F32 precision, stored in Safetensors format.