Difference between versions?

#2
by dipta007 - opened

What is the difference between V1.0, V1.1 and V2.0?

By the way, great work.

Hi @dipta007
Version 1.0 means we trained the model on a mixed subset of the data containing 6B Bangla tokens.
Version 1.1 means we increased the data to 8.5B tokens.
In version 1 we didn’t extend the tokenizer with Bangla tokens.

Version 2 means we extended the tokenizer with additional Bangla tokens, and the data size is 37B Bangla tokens.
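
As a rough illustration of why extending the tokenizer can matter, here is a toy sketch. It is not the actual tokenizer or training code for these models: the `tokenize` helper, the greedy longest-match scheme, and both vocabularies are hypothetical and chosen only to show the effect. Without Bangla entries in the vocabulary, the text falls back to one token per UTF-8 byte; with them, the same text needs far fewer tokens.

```python
# Toy illustration only: NOT the tokenizer used for these models.
# The vocabularies and the greedy matching scheme are assumptions.

def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match tokenization with UTF-8 byte fallback."""
    max_len = max((len(t) for t in vocab), default=1)
    tokens: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that starts at position i.
        for j in range(min(len(text), i + max_len), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No vocabulary match: emit one token per UTF-8 byte.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


base_vocab = {"hello", " ", "world"}              # hypothetical vocab without Bangla
extended_vocab = base_vocab | {"বাংলা", "ভাষা"}    # hypothetical added Bangla tokens

text = "বাংলা ভাষা"
before = tokenize(text, base_vocab)      # byte fallback for every Bangla character
after = tokenize(text, extended_vocab)   # whole words matched directly

print(len(before), len(after))
```

Real subword tokenizers are more involved than this, but the same idea applies: fewer tokens per Bangla sentence means more text fits in the context window.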

Thanks for your appreciation.

@sagorsarker Wow, nice work open-sourcing all the intermediate models.
Do you have any plans to open-source any of the datasets if you have not done so already?

Hishab org

We plan to open-source the pretraining datasets. I hope it will happen soon.
