---
license: apache-2.0
datasets:
- KathirKs/fineweb-edu-hindi
language:
- en
- hi
base_model:
- google/gemma-2-2b
pipeline_tag: text-generation
---


# Gemma-2b-hindi:

Gemma-2b-hindi is a transformer model continually pretrained on 300 billion Hindi tokens, using Gemma-2-2b as the base model.

# Hyperparameters:
- learning_rate: 2e-4 (set low for fine-tuning)
- weight_decay: 0.1
- min_lr_ratio: 0.00225
- warmup: 0.01
- decay: 0.99
- rewarmup: 0.01
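The warmup and decay fractions above are given relative to the total number of training steps, with the learning rate bottoming out at min_lr_ratio × peak LR. A minimal sketch of such a schedule, assuming linear warmup and cosine decay (the exact decay shape used in this run is not stated here):

```python
import math

def lr_at(step, total_steps, peak_lr=2e-4, warmup_frac=0.01, min_lr_ratio=0.00225):
    """Linear warmup over warmup_frac of steps, then cosine decay
    down to min_lr_ratio * peak_lr (illustrative, not the exact run config)."""
    warmup_steps = int(total_steps * warmup_frac)
    min_lr = peak_lr * min_lr_ratio
    if step < warmup_steps:
        # ramp linearly from 0 to the peak learning rate
        return peak_lr * step / max(warmup_steps, 1)
    # cosine decay over the remaining steps
    progress = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, with 100,000 total steps the rate ramps up over the first 1,000 steps, peaks at 2e-4, and decays toward 4.5e-7.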

# Code:
The [levanter](https://github.com/stanford-crfm/levanter) repository was used to train the model on Google Cloud TPUs.
<br> Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
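In levanter, hyperparameters like those above are typically passed through the optimizer section of a YAML training config. A hedged sketch (field names follow levanter's config conventions; the actual config file for this run is not shown here):

```yaml
# Illustrative levanter optimizer config fragment (not the original run file)
optimizer:
  learning_rate: 2e-4
  weight_decay: 0.1
  min_lr_ratio: 0.00225
  warmup: 0.01
  decay: 0.99
  rewarmup: 0.01
```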

# Contact:

If you have any queries or issues, reach out to: [Kathir](mailto:[email protected])