Update README.md
README.md
CHANGED
@@ -8,4 +8,8 @@ language:
 These are the PyTorch model parameters and associated data used to train a small transformer model from scratch.
 The transformer model is trained to translate from hindi_latin to English.
 
-The training dataset used to create the model is also included among the files.
+The training dataset used to create the model is also included among the files. The training data is semi-synthetic.
+
+Steps for creating the dataset:
+Obtain actual user questions in Hindi and human translations of them into English.
+Prompt GPT to create variations of key words, taking phonetics into account and supplying a user persona.
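
The two dataset steps in the README can be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline: the `SeedPair` class, the `build_variation_prompt` function, the prompt wording, and the sample sentence are all hypothetical and do not come from the repository. The sketch only builds the prompt text; sending it to a GPT model is left out because the README does not say which model or client was used.

```python
from dataclasses import dataclass


@dataclass
class SeedPair:
    """One real user question with its human translation (hypothetical structure)."""
    hindi_latin: str
    english: str


def build_variation_prompt(pair: SeedPair, key_words: list[str], persona: str) -> str:
    """Build a prompt asking an LLM for phonetic spelling variants of the key words."""
    words = ", ".join(key_words)
    return (
        f"You are writing as this user: {persona}.\n"
        f"Original question (Hindi in Latin script): {pair.hindi_latin}\n"
        f"Human English translation: {pair.english}\n"
        f"Key words to vary: {words}\n"
        "Rewrite the question several times, changing only the spelling of the "
        "key words to other phonetically plausible romanizations. "
        "Keep the meaning identical so the English translation still applies."
    )


# Made-up example pair, for illustration only (not taken from the dataset).
pair = SeedPair("mera balance kitna hai", "What is my balance?")
prompt = build_variation_prompt(
    pair,
    key_words=["kitna", "mera"],
    persona="a first-time smartphone user who types quickly",
)
print(prompt)
```

Because each generated variant keeps the meaning of the original, it can be paired with the same human English translation. That is one plausible reading of "semi-synthetic": real questions and translations, with synthetic spelling variation on the source side.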