
Bielik-11B-v3

Bielik-11B-v3 is a generative text model with 11 billion parameters. It is initialized from Mistral-7B-v0.2 and further pre-trained on Polish text. The model is the result of a collaboration between the open-science/open-source project SpeakLeash and the High Performance Computing (HPC) center ACK Cyfronet AGH. It was trained on Polish text corpora selected and processed by the SpeakLeash team, using Polish large-scale computing infrastructure within the PLGrid environment at ACK Cyfronet AGH.

The creation and training of Bielik-11B-v3 were supported by computational grant no. PLG/2024/016951, carried out on the Athena and Helios supercomputers, which provided the computational resources essential for large-scale machine learning. As a result, the model understands and processes Polish exceptionally well, providing accurate responses and performing a variety of linguistic tasks with high precision.

โš ๏ธ This is a base model intended for further fine-tuning across most use cases.

Model

The model training was conducted on the Helios Supercomputer at ACK Cyfronet AGH, utilizing 128 NVIDIA GH200 GPUs.

The training dataset was composed of Polish texts collected and made available through the SpeakLeash project, as well as a subset of CommonCrawl data.

Model description:

- Developed by: SpeakLeash, in collaboration with ACK Cyfronet AGH
- Language: Polish
- Model type: generative text model
- Initialized from: Mistral-7B-v0.2
- Model size: 11.2B parameters
- Tensor type: BF16