Fork of salesforce/BLIP for a image-captioning task on 🤗Inference endpoint.

This repository implements a custom task for image-captioning for 🤗 Inference Endpoints. The code for the customized pipeline is in the pipeline.py. To use deploy this model a an Inference Endpoint you have to select Custom as task to use the pipeline.py file. -> double check if it is selected

expected Request payload

{
  "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64 image as bytes
}

below is an example on how to run a request using Python and requests.

Run Request

  1. prepare an image.
!wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg

2.run request

import json
from typing import List
import requests as r
import base64

ENDPOINT_URL = ""
HF_TOKEN = ""

def predict(path_to_image: str = None):
    with open(path_to_image, "rb") as i:
        image = i.read()
    payload = {
        "inputs": [image],
        "parameters": {
                   "do_sample": True,
                   "top_p":0.9,
                   "min_length":5,
                   "max_length":20
        }
    }
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()
prediction = predict(
    path_to_image="palace.jpg"
)

Example parameters depending on the decoding strategy:

  1. Beam search
        "parameters": {
                   "num_beams":5,
                   "max_length":20
        }
  1. Nucleus sampling
        "parameters": {
                   "num_beams":1,
                   "max_length":20,
                   "do_sample": True,
                   "top_k":50,
                   "top_p":0.95
        }
  1. Contrastive search
        "parameters": {
                   "penalty_alpha":0.6,
                   "top_k":4
                   "max_length":512
        }

See generate() doc for additional detail

expected output

['buckingham palace with flower beds and red flowers']
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The HF Inference API does not support image-to-text models for generic library.

Space using florentgbelidji/blip_captioning 1