Why the Creative Commons Attribution Non Commercial 4.0 License?

#15
by RichardDeetlefs - opened

Any clue why they went with the Creative Commons Attribution-NonCommercial 4.0 license for this? It is based on colbert-ir/colbertv2.0, which is MIT-licensed... A little bit annoying. I wonder when Jina AI will change this...

Has anyone found a good alternative that is also multilingual but has a more permissive license?

Jina AI org

@RichardDeetlefs I appreciate your interest in the model!

The algorithm is ColBERT, while the model is trained from scratch at Jina. We license all of our models under CC BY-NC. For research and hobby projects this should not be a problem.

Thanks @bwang0911

I did notice that quite a few of your (Jina AI) models are under the Apache License 2.0, and was surprised when I saw that this one is CC BY-NC. I presume this will change once you have a newer model. I will wait until then.

But anyway, you are clearly deep into the subject matter. Do you have any alternative multilingual late interaction retriever models with a more permissive license to recommend?

Jina AI org

@RichardDeetlefs Not really. I believe all of our future models will be released under the same license. To be honest, I doubt there is any other late interaction model that supports multilinguality, apart from one called ColBERT-XM, which is a multilingual late interaction model (although it is trained on English data).

If you are interested in using the model without worrying about licensing or hosting, our API may be a good choice as well (https://jina.ai/embeddings/).

Link for ColBERT-XM: https://huggingface.co/antoinelouis/colbert-xm

bwang0911 changed discussion status to closed

@RichardDeetlefs It's because they want money.