Aaltjo committed
Commit eb4f8fc · verified · 1 Parent(s): a24668e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
@@ -1,15 +1,32 @@
+---
+license: mit
+datasets:
+- openwebtext
+- alpacha
+tags:
+- text-generation
+- gpt-2
+- openwebtext
+- alpacha
+model_name: gpt2-124M
+language: en
+---
+
 # GPT-2 124M Fine-tuned on OpenWebText and Alpacha

 This model is a fine-tuned version of GPT-2 (124M parameters) trained on the OpenWebText dataset and further fine-tuned on the Alpacha dataset.

 ## Model Description
+
 This model is based on the GPT-2 architecture and has been fine-tuned on a combination of two datasets:
+
 1. **OpenWebText**: The model was initially trained on the OpenWebText dataset for 600,000 iterations.
 2. **Alpacha**: The model was further fine-tuned on the Alpacha dataset for the remaining 50,000 iterations.

 The model was trained using a **laptop with an RTX 3060 GPU** for a total of **650,000 iterations** (approximately **8 days** of training).

 ## Hardware Details
+
 - **GPU**: Laptop with an **RTX 3060**
 - **Training Time**: The model took **8 days** (approximately 650,000 iterations) to train.
 - **Total Iterations**: 650,000 iterations (600,000 on OpenWebText + 50,000 on Alpacha).
@@ -29,3 +46,4 @@ inputs = tokenizer(input_text, return_tensors="pt")
 outputs = model.generate(**inputs)

 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
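The README's usage example is truncated in this diff: only its closing lines appear as hunk context. A minimal self-contained sketch consistent with those visible lines is given below; the repo id, prompt, and `max_new_tokens` value are illustrative assumptions, not taken from the commit.

```python
# Sketch of the load-and-generate flow whose tail is visible in the diff above.
# The repo id is a placeholder assumption; replace it with this model's actual Hub path.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

repo_id = "Aaltjo/gpt2-124M"  # hypothetical repo id

tokenizer = GPT2Tokenizer.from_pretrained(repo_id)
model = GPT2LMHeadModel.from_pretrained(repo_id)

input_text = "Once upon a time"  # example prompt, not from the commit
inputs = tokenizer(input_text, return_tensors="pt")

# Greedy decoding by default; max_new_tokens bounds the length of the completion.
outputs = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

GPT-2 ships without a pad token, so `generate` may warn about a missing `pad_token_id`; passing `pad_token_id=tokenizer.eos_token_id` to `generate` silences that warning.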