Difference between versions?
#2 by dipta007 - opened
What is the difference between V1.0, V1.1 and V2.0?
By the way, great work.
Hi @dipta007,
Version 1.0 means we trained the model on a mixed fraction of the data containing 6B Bangla tokens.
Version 1.1 means we increased the data to 8.5B tokens.
In version 1, we didn't extend the tokenizer with Bangla tokens.
Version 2 means we extended the tokenizer with additional Bangla tokens, and the data size is 37B Bangla tokens.
Thanks for your appreciation.
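For anyone wondering why extending the tokenizer matters, here is a toy sketch. It is not the tokenizer or the procedure used for these models. The `tokenize` function, both vocabularies, and the byte-fallback format are hypothetical and only illustrate the general idea: when a vocabulary has no Bangla entries, Bangla text falls back to many byte-level tokens, and adding Bangla tokens shortens the sequence.

```python
def tokenize(text, vocab):
    """Toy greedy longest-match tokenizer with UTF-8 byte fallback (illustrative only)."""
    tokens = []
    max_len = max((len(t) for t in vocab), default=0)
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                match = piece
                break
        if match is not None:
            tokens.append(match)
            i += len(match)
        else:
            # No vocabulary entry covers this character: emit one token per byte.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical vocabularies: the base one has no Bangla entries.
base_vocab = {"hello", " ", "world"}
extended_vocab = base_vocab | {"বাংলা", "ভাষা"}

text = "বাংলা ভাষা"
before = tokenize(text, base_vocab)
after = tokenize(text, extended_vocab)

print(len(before), "tokens with the base vocabulary")
print(len(after), "tokens with the extended vocabulary")
```

With a real model, newly added tokens also need embeddings, which have to be learned during further training.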
@sagorsarker Wow, nice work open-sourcing all the intermediate models.
Do you have any plans to open-source any of the datasets if you have not done so already?
We plan to open-source the pretraining datasets. I hope it will happen soon.