# 🩺 Medical Image Captioning with BLIP on ROCO Dataset

This project fine-tunes the `Salesforce/blip-image-captioning-base` model on the `eltorio/ROCOv2-radiology` dataset to generate medical reports from radiology images.
## 📦 Model Details

- Base Model: `Salesforce/blip-image-captioning-base`
- Dataset: `eltorio/ROCOv2-radiology`
- Task: Medical image captioning (radiology report generation)
- Framework: 🤗 Transformers
- Training Objective: Generate radiology captions conditioned on image inputs
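The training objective above means each radiology image is paired with its reference caption, and padded caption positions must be masked out of the loss. A minimal, framework-free sketch of that batching step (the function name and the pad/ignore ids here are illustrative, not part of this repository; in practice `BlipProcessor` plus a Transformers data collator handles this):

```python
def collate_captions(tokenized_captions, pad_token_id=0, ignore_index=-100):
    """Pad a batch of tokenized captions to a common length.

    Returns input_ids, attention_mask, and labels, where label positions
    that correspond to padding are set to ignore_index so the
    cross-entropy loss skips them during fine-tuning.
    """
    max_len = max(len(ids) for ids in tokenized_captions)
    input_ids, attention_mask, labels = [], [], []
    for ids in tokenized_captions:
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * pad)       # pad to max_len
        attention_mask.append([1] * len(ids) + [0] * pad)  # 1 = real token
        labels.append(ids + [ignore_index] * pad)          # -100 = ignored by loss
    return input_ids, attention_mask, labels
```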
## 🧪 Example Inference

```python
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image
import requests

# Load the fine-tuned processor and model from the Hugging Face Hub
processor = BlipProcessor.from_pretrained("khalednabawi11/blip-roco-model")
model = BlipForConditionalGeneration.from_pretrained("khalednabawi11/blip-roco-model")

# Fetch a radiology image (replace with your own image URL or a local path)
img_url = "https://example.com/sample-chest-xray.jpg"
image = Image.open(requests.get(img_url, stream=True).raw).convert("RGB")

# Preprocess the image and generate a caption
inputs = processor(images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs)
caption = processor.decode(output[0], skip_special_tokens=True)
print("Generated Report:", caption)
```