Usage instruction
I tried to run this model with --pooling rerank in llama-server, but it reports that it cannot find the sep_token, so I can't use it. How can I use this model as a reranker with the llama.cpp server?
According to the help output, there is no "rerank" pooling type, only "rank". And shouldn't the default be taken from the model?
If that's not it, llama.cpp might simply not support the model yet (i.e. either the conversion produced a broken GGUF, or llama-server has no support for it).
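For reference, here is a rough sketch of how a rerank request against llama-server could look once a model loads correctly with rank pooling. Everything specific here is an assumption to check against `llama-server --help` and the server README for your build: that the server was started with `--pooling rank` (possibly together with `--reranking`), that it listens on `localhost:8080`, that the endpoint is `/v1/rerank`, and that the response carries a `results` list with `index` and `relevance_score` fields. It will not fix the sep_token error described above.

```python
import json
import urllib.request

# Assumed endpoint; adjust host, port, and path to your llama-server setup.
SERVER_URL = "http://localhost:8080/v1/rerank"


def build_rerank_payload(query, documents, top_n=None):
    """Build the JSON body for a rerank request (field names are assumed)."""
    payload = {"query": query, "documents": list(documents)}
    if top_n is not None:
        payload["top_n"] = top_n
    return payload


def rank_documents(documents, results):
    """Pair each document with its score, best match first.

    `results` is assumed to be a list of dicts with `index` (position in
    `documents`) and `relevance_score` keys.
    """
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ordered]


def rerank(query, documents, url=SERVER_URL):
    """Send the request to a running server and return ranked documents."""
    body = json.dumps(build_rerank_payload(query, documents)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return rank_documents(documents, data["results"])
```

With a server running, calling `rerank("what is a panda?", ["hi", "The giant panda is a bear native to China."])` should return the documents ordered by score, assuming the response shape matches the guess above.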
See this issue https://github.com/ggml-org/llama.cpp/issues/13820 and the related pull request https://github.com/ggml-org/llama.cpp/pull/14029
Thanks a lot for finding those issues. That's a lot of changes, and it's not clear to me whether the model needs to be requantized. If it still doesn't work once the pull request is merged, you (or anybody) can drop us a note and we will redo the model.