Vinsingh committed on
Commit 90181c8 · verified · 1 Parent(s): 80b5215

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -8,4 +8,8 @@ language:
  This is the pytorch model parameters and associated data used for training a small transformer model from scratch.
  The transformer model is used to train for translation from hindi_latin to english.
 
- Among the files, training dataset used to create the model is also there.
+ Among the files, training dataset used to create the model is also there. Data used for training is semi-synthetic.
+
+ Steps for creating datasets:
+ Obtain actual user questions in hindi and human translations thereof in english.
+ Prompt GPT to create variations of key words taking phonetics into account and giving a user persona.
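
The dataset-creation steps added in this commit could be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's actual pipeline: the function names `build_variation_prompt` and `make_pairs`, the prompt wording, the persona text, and the sample question are all hypothetical, and the source question is assumed to be in Latin script already. The LLM call is abstracted as a `generate` callable so that no particular GPT client API is assumed; a stub stands in for it here.

```python
from typing import Callable, Iterable, List, Tuple


def build_variation_prompt(question: str, keywords: Iterable[str], persona: str) -> str:
    """Compose a prompt asking an LLM for phonetic respellings of key words.

    The wording is a hypothetical example, not the prompt used by the author.
    """
    keyword_list = ", ".join(keywords)
    return (
        f"You are {persona}.\n"
        f"Original question (romanized Hindi): {question}\n"
        f"Key words: {keyword_list}\n"
        "Rewrite the question, respelling the key words the way this user "
        "might type them phonetically. Return one variation per line."
    )


def make_pairs(
    question: str,
    translation: str,
    keywords: Iterable[str],
    persona: str,
    generate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair the original question and each generated variant with the same
    human English translation, producing semi-synthetic training pairs."""
    prompt = build_variation_prompt(question, keywords, persona)
    variants = [line.strip() for line in generate(prompt).splitlines() if line.strip()]
    # Keep the original first and drop duplicates while preserving order.
    sources = list(dict.fromkeys([question, *variants]))
    return [(source, translation) for source in sources]


def fake_generate(prompt: str) -> str:
    """Stub standing in for a real LLM call (assumption for this sketch)."""
    return "mera khata kab khulega\nmera khaata kab khulegaa\n"


pairs = make_pairs(
    question="mera khata kab khulega",            # made-up sample question
    translation="When will my account open?",     # made-up human translation
    keywords=["khata", "khulega"],
    persona="a first-time smartphone user from a small town",
    generate=fake_generate,
)
for source, target in pairs:
    print(f"{source}\t{target}")
```

Passing the model call in as a function keeps the pairing logic testable offline; a real run would replace `fake_generate` with whatever client is actually used to reach GPT.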