|
---
license: apache-2.0
datasets:
- KathirKs/fineweb-edu-hindi
language:
- en
- hi
base_model:
- google/gemma-2-2b
pipeline_tag: text-generation
---
|
|
|
|
|
# Gemma-2b-hindi: |
|
|
|
Gemma-2b-hindi is a transformer model continually pretrained on 300 billion Hindi tokens, using [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) as the base model.
|
|
|
# Hyperparameters: |
|
```yaml
learning_rate: 2e-4   # set low for fine-tuning
weight_decay: 0.1
min_lr_ratio: 0.00225
warmup: 0.01
decay: 0.99
rewarmup: 0.01
```
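To illustrate how these values might interact, here is a minimal sketch of a learning-rate schedule in Python. It assumes a linear warmup over `warmup * total_steps` steps followed by a cosine decay down to `min_lr_ratio * learning_rate`; the exact schedule used in the actual training run may differ, and `total_steps` is a placeholder, not a value from this run.

```python
import math

def lr_at(step, total_steps, peak_lr=2e-4, warmup=0.01, min_lr_ratio=0.00225):
    """Illustrative schedule: linear warmup, then cosine decay to min_lr.

    A sketch of how the hyperparameters above might combine; the schedule
    actually used for Gemma-2b-hindi may differ.
    """
    warmup_steps = int(warmup * total_steps)
    min_lr = min_lr_ratio * peak_lr
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate
        return peak_lr * step / max(warmup_steps, 1)
    # Cosine decay from peak_lr down to min_lr over the remaining steps
    progress = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

The rate peaks right after warmup and bottoms out at `min_lr_ratio * peak_lr` (here 4.5e-7) at the final step.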
|
|
|
# Code: |
|
The [levanter](https://github.com/stanford-crfm/levanter) repository was used to train the model on Google Cloud TPUs.

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
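For orientation, the hyperparameters above would sit in the optimizer section of a levanter training config. The fragment below is a hypothetical sketch: the key names mirror levanter-style configs, but it is not the config file from the actual run.

```yaml
# Hypothetical levanter-style config fragment (not the actual run's file)
optimizer:
  learning_rate: 2e-4
  weight_decay: 0.1
  min_lr_ratio: 0.00225
  warmup: 0.01
  decay: 0.99
  rewarmup: 0.01
```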
|
|
|
# Contact: |
|
|
|
If you have any queries or issues, reach out to [Kathir](mailto:[email protected]).