KathirKs committed on
Commit 99ecadc · verified · 1 Parent(s): d797a4b

Create README.md

  1. README.md +32 -0
README.md ADDED
@@ -0,0 +1,32 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - KathirKs/fineweb-edu-hindi
+ language:
+ - en
+ - hi
+ base_model:
+ - google/gemma-2-2b
+ pipeline_tag: text-generation
+ ---
+
+
+ # Gemma-2b-hindi:
+
+ Gemma-2b-hindi is a transformer model obtained by continued pretraining of the Gemma-2-2b base model on 300 billion Hindi tokens.
+
+ # Hyperparameters:
+ learning_rate: 2E-4 # set low for fine-tuning <br>
+ weight_decay: 0.1 <br>
+ min_lr_ratio: 0.00225 <br>
+ warmup: 0.01 <br>
+ decay: 0.99 <br>
+ rewarmup: 0.01 <br>
+
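As a rough illustration of how the hyperparameters above might combine into a learning-rate schedule, here is a minimal Python sketch. It assumes that `warmup` and `decay` are fractions of the total number of training steps, that warmup is linear from zero, and that decay follows a cosine curve down to `learning_rate * min_lr_ratio`. The README states none of this, so treat the schedule shape as an assumption; the function name `lr_at` and the `total_steps` argument are hypothetical. `rewarmup` is left out because it would presumably only matter when the schedule runs for more than one cycle.

```python
import math


def lr_at(step, total_steps, peak_lr=2e-4, min_lr_ratio=0.00225,
          warmup=0.01, decay=0.99):
    """Hypothetical schedule: linear warmup, optional plateau, cosine decay.

    `warmup` and `decay` are ASSUMED to be fractions of `total_steps`.
    """
    warmup_steps = round(warmup * total_steps)
    decay_steps = round(decay * total_steps)
    # Any steps not covered by warmup or decay are held at the peak rate.
    stable_steps = max(0, total_steps - warmup_steps - decay_steps)
    min_lr = peak_lr * min_lr_ratio

    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    if step < warmup_steps + stable_steps:
        return peak_lr

    # Cosine decay from peak_lr down to min_lr over decay_steps.
    progress = (step - warmup_steps - stable_steps) / max(1, decay_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 10_000  # hypothetical step count, not taken from the README
    for s in (0, 50, 100, 5_050, 10_000):
        print(s, lr_at(s, total))
```

With the listed values, warmup (1%) and decay (99%) together cover the whole run, so under these assumptions there is no plateau at the peak rate, and the final learning rate is 2E-4 × 0.00225 = 4.5E-7.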
+ # Code:
+ The model was trained on Google Cloud TPUs using the [Levanter](https://github.com/stanford-crfm/levanter) repository.
+ <br> Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
+
+ # Contact:
+
+ If you have any queries or issues, reach out to: [Kathir](mailto:[email protected])