---
license: apache-2.0
datasets:
- KathirKs/fineweb-edu-hindi
language:
- en
- hi
base_model:
- google/gemma-2-2b
pipeline_tag: text-generation
---

# Gemma-2b-hindi:

Gemma-2b-hindi is a transformer model continually pretrained on 300 billion Hindi tokens, using Gemma-2-2b as the base model.

# Hyperparameters:

learning_rate: 2E-4
weight_decay: 0.1
min_lr_ratio: 0.00225
warmup: 0.01
decay: 0.99
rewarmup: 0.01
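
The hyperparameters above describe a warmup/decay learning-rate schedule. As a rough illustration only, here is a sketch of such a schedule in Python, assuming linear warmup followed by cosine decay down to `min_lr_ratio` times the peak rate (the exact schedule shape, total step count, and the handling of `rewarmup` depend on the levanter training config and are not specified here):

```python
import math

# Values from the hyperparameter list above.
LEARNING_RATE = 2e-4
MIN_LR_RATIO = 0.00225
WARMUP_FRAC = 0.01  # fraction of total steps spent in linear warmup

def lr_at(step: int, total_steps: int) -> float:
    """Assumed schedule: linear warmup, then cosine decay to MIN_LR_RATIO * peak."""
    warmup_steps = int(WARMUP_FRAC * total_steps)
    min_lr = MIN_LR_RATIO * LEARNING_RATE
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Cosine decay from the peak down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (LEARNING_RATE - min_lr) * cosine
```

With 100,000 total steps, this sketch warms up over the first 1,000 steps, peaks at 2e-4, and decays to 2e-4 × 0.00225 = 4.5e-7 by the final step.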
# Code:

The [levanter](https://github.com/stanford-crfm/levanter) repository was used to train the model on Google Cloud TPUs.
Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).

# Contact:

If you have any queries or issues, reach out to: [Kathir](mailto:kathirksw@gmail.com)