Question about license
Hi!
Thank you very much for open sourcing the model!
Pardon my ignorance about software licenses.
The pre-training dataset used for training Lucie-7B is licensed under CC-BY-NC-4.0, which restricts it to non-commercial use.
However, the model weights are licensed under Apache-2.0, which permits commercial use.
Does this mean the license of the pre-training dataset doesn't matter when determining the commercial suitability of the model weights?
Here are some answers, based on discussions within OSI and OSAID:
(1) The AI model should not be considered a work derived from the data, which means that the license of the training dataset does not affect the license of the model.
(2) For an AI system to be considered open source, it must provide (i) the model under an open source license with no usage restrictions, (ii) the training and data processing code under an open source license, and (iii) the complete list of data used for training, and if possible the data itself.
cf. https://opensource.org/ai/open-source-ai-definition
That is what we did for Lucie-7B.
I understand now, thank you very much for your reply!