Quants?

#1 opened by ctranslate2-4you

Is this something that could perhaps be quantized, and do you know if it's compatible with llama.cpp? Also, would it work with Transformers.js and the ONNX Runtime in general?

Sentence Transformers org

Hello!

It's not compatible with llama.cpp, I'm afraid, and support is likely not going to happen either: this model is just too different.
Support for Transformers.js is more plausible (cc @Xenova), because we already have support for Model2Vec models, and these are quite similar. However, I can't be sure that it'll be added.

  • Tom Aarsen

Done: https://huggingface.co/sentence-transformers/static-retrieval-mrl-en-v1/discussions/2

Conversion code:

import torch
from sentence_transformers import SentenceTransformer

class WrappedModel(torch.nn.Module):
    """Expose the model's embedding layer behind a padded (input_ids, attention_mask) interface."""

    def __init__(self, m):
        super().__init__()
        self.embedding = m[0].embedding

    def forward(self, input_ids, attention_mask):
        # Convert the padded batch into the flat (indices, offsets) format the
        # embedding bag expects: keep only non-padded token ids, and record
        # where each sequence starts in the flattened index tensor.
        indices = input_ids[attention_mask == 1]
        offsets = torch.cat([torch.tensor([0]), attention_mask.sum(dim=1)[:-1].cumsum(dim=0)])
        return self.embedding(indices, offsets)

# Toy padded batch: 3 sequences of max length 4. Positions with attention_mask == 0
# can hold any placeholder id (-1, 0); they are dropped before the embedding lookup.
shape = (3, 4)
input_ids = torch.tensor([1, 2, 3, 4, 5, 6, -1, -1, 1, 1, 1, 0]).view(shape)
attention_mask = torch.tensor([1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0]).view(shape)

model = SentenceTransformer("tomaarsen/static-retrieval-mrl-en-v1")
wrapped = WrappedModel(model)
wrapped(input_ids, attention_mask)  # test forward pass before exporting

# Export the model
torch.onnx.export(
    wrapped,
    (input_ids, attention_mask),
    "model.onnx",
    export_params=True,
    opset_version=14,
    do_constant_folding=True,
    input_names=["input_ids", "attention_mask"],
    output_names=["sentence_embedding"],
    dynamic_axes={
        "input_ids": {0: "batch_size", 1: "sequence_length"},
        "attention_mask": {0: "batch_size", 1: "sequence_length"},
        "sentence_embedding": {0: "batch_size"},
    },
)
