HOW TO USE WITH TEI

#3 by nbroad

Set up the container:

```shell
model=ibm/re2g-reranker-nq
volume=$PWD/data
# specify this PR revision
revision=refs/pr/3

docker run --gpus all -p 8080:80 -v $volume:/data --pull always \
    ghcr.io/huggingface/text-embeddings-inference:1.6 \
    --model-id $model --revision $revision
```

Call the endpoint:

Because this model has two output classes, TEI can't treat it as a re-ranker (re-rankers have exactly one output label). The `/predict` route must therefore be used, which treats the model as a classifier.

```shell
curl 127.0.0.1:8080/predict \
    -X POST \
    -d '{"inputs": ["What is deep learning?", "Deep Learning\n\nDL is about machine learning and ai"]}' \
    -H 'Content-Type: application/json'
```

Text needs to be passed as a pair, `["Query", "Title\n\nPassage"]`, as mentioned here.
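For reference, here is the same request as a minimal Python sketch (assuming the container from the setup above is running locally; `requests` is the only dependency):

```python
import requests

# Query/passage pair; the passage is formatted as "Title\n\nPassage".
payload = {
    "inputs": [
        "What is deep learning?",
        "Deep Learning\n\nDL is about machine learning and ai",
    ]
}

resp = requests.post("http://127.0.0.1:8080/predict", json=payload)
resp.raise_for_status()
print(resp.json())  # classifier output (label/score pairs)
```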

## Note about formatting [WIP]

According to the code, I believe it is using the facebook/rag-token-nq tokenizer: see here
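As a quick sanity check, here is a minimal sketch (assuming `transformers` is installed) that loads the RAG tokenizer and prints which tokenizer it wraps for the question encoder:

```python
from transformers import RagTokenizer

# RagTokenizer bundles a question-encoder tokenizer and a generator
# tokenizer; Re2G uses the question-encoder half for its inputs.
tok = RagTokenizer.from_pretrained("facebook/rag-token-nq")
print(type(tok.question_encoder))  # a DPR (BERT-style) tokenizer class
```

The trace through the Re2G code: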

1. Example script (calls string_retrieve): https://github.com/IBM/kgi-slot-filling/blob/re2g/dpr/dpr_apply.py#L60
2. string_retrieve calls prepare_seq2seq_batch: https://github.com/IBM/kgi-slot-filling/blob/re2g/corpus/corpus_client.py#L108
3. prepare_seq2seq_batch calls tokenizer.question_encoder: https://github.com/IBM/kgi-slot-filling/blob/re2g/generation/rag_util.py#L268C5-L268C26
4. tokenizer.question_encoder is just the tokenizer that is passed to RagTokenizer.from_pretrained: https://github.com/huggingface/transformers/blob/5fa35344755d8d9c29610b57d175efd03776ae9e/src/transformers/models/rag/tokenization_rag.py#L54
5. The actual tokenizer passed to RagTokenizer is a DPRTokenizer (see the code here, and the files in the model repo)
6. Here is the code that calls the DPRTokenizer:

```python
tokenizer(
    src_texts,  # the batch of input strings
    add_special_tokens=True,
    max_length=512,
    padding="longest",
    truncation=True,
)
```

7. I am assuming that the format is `[CLS] query [SEP] passage [SEP]`, because of this code and because that is what you get from `tokenizer.decode(tokenizer.encode("query", "passage"))` (see the sketch after this list).
8. Since TEI will add the BOS and EOS tokens ([CLS] and [SEP]), only the middle [SEP] token needs to be added.
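A minimal sketch of that check (assuming the DPR question-encoder tokenizer that facebook/rag-token-nq wraps; the exact repo name here is my assumption):

```python
from transformers import AutoTokenizer

# Assumed: the question-encoder tokenizer behind facebook/rag-token-nq.
# It is a BERT-style tokenizer, so encoding a text pair shows the format.
tokenizer = AutoTokenizer.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base"
)

ids = tokenizer.encode("query", "passage")
print(tokenizer.decode(ids))
# Expected output: [CLS] query [SEP] passage [SEP]
```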
