Why the Creative Commons Attribution-NonCommercial 4.0 License?
Any clue why they went with the Creative Commons Attribution-NonCommercial 4.0 license for this? It is based on colbert-ir/colbertv2.0, which is MIT-licensed... A little bit annoying. I wonder when Jina AI will change this...
Has anyone found a good alternative that is also multilingual but has a better license?
@RichardDeetlefs I appreciate your interest in the model!
The algorithm is ColBERT, but the model was trained from scratch at Jina. We license all of our models under CC BY-NC. For research and hobby projects this should not be a problem.
Thanks @bwang0911
I did notice that quite a few of your (Jina AI) models are under the Apache License 2.0, so I was surprised to see that this one is CC BY-NC. I presume this will change once you guys have a newer model. I will wait till then.
But anyway, you are clearly deep into the subject matter - do you have any alternative Multilingual Late Interaction Retriever Models with a better license to recommend?
@RichardDeetlefs Not really. I believe all of our future models will be licensed under the same terms. To be honest, I doubt there are many late interaction models that support multilinguality. One is ColBERT-XM, which is a multilingual late interaction model (although it was trained on English data).
If you are interested in using the model without worrying about licensing or hosting, maybe our API is a good choice as well (https://jina.ai/embeddings/).
Link for ColBERT-XM: https://huggingface.co/antoinelouis/colbert-xm
@RichardDeetlefs it's because they want money