riemanli committed (verified)
Commit 9ec9fb7 · 1 Parent(s): d8ae717

lora_gpt2_paper_params

Files changed (2)
  1. README.md +22 -20
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -20,8 +20,8 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [gpt2-medium](https://huggingface.co/gpt2-medium) on the e2e_nlg dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.6841
- - Bleu: 0.1572
+ - Loss: 2.4493
+ - Bleu: 0.3781
 
  ## Model description
 
@@ -47,29 +47,31 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 500
- - num_epochs: 15
+ - num_epochs: 10
  - mixed_precision_training: Native AMP
  - label_smoothing_factor: 0.1
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Bleu |
- |:-------------:|:-----:|:----:|:---------------:|:------:|
- | 5.5605 | 1.0 | 25 | 5.0293 | 0.0 |
- | 5.6174 | 2.0 | 50 | 4.9749 | 0.0 |
- | 5.3359 | 3.0 | 75 | 4.8386 | 0.0255 |
- | 5.0451 | 4.0 | 100 | 4.5509 | 0.0143 |
- | 4.6183 | 5.0 | 125 | 4.0375 | 0.0536 |
- | 3.9072 | 6.0 | 150 | 3.5277 | 0.0052 |
- | 3.6058 | 7.0 | 175 | 3.2481 | 0.1550 |
- | 3.4162 | 8.0 | 200 | 3.0935 | 0.0140 |
- | 3.2618 | 9.0 | 225 | 2.9592 | 0.0 |
- | 3.1868 | 10.0 | 250 | 2.8875 | 0.0196 |
- | 3.1306 | 11.0 | 275 | 2.8068 | 0.0 |
- | 3.0673 | 12.0 | 300 | 2.7307 | 0.0 |
- | 3.054 | 13.0 | 325 | 2.6913 | 0.0 |
- | 2.9306 | 14.0 | 350 | 2.6773 | 0.0 |
- | 2.9358 | 15.0 | 375 | 2.6841 | 0.1572 |
+ | Training Loss | Epoch | Step | Validation Loss | Bleu |
+ |:-------------:|:------:|:-----:|:---------------:|:------:|
+ | 2.9523 | 0.5706 | 3000 | 2.6028 | 0.3489 |
+ | 2.6924 | 1.1411 | 6000 | 2.5544 | 0.3501 |
+ | 2.6493 | 1.7117 | 9000 | 2.5217 | 0.4052 |
+ | 2.6252 | 2.2822 | 12000 | 2.5048 | 0.3894 |
+ | 2.6023 | 2.8528 | 15000 | 2.4957 | 0.4060 |
+ | 2.5962 | 3.4234 | 18000 | 2.4863 | 0.3772 |
+ | 2.5797 | 3.9939 | 21000 | 2.4812 | 0.3697 |
+ | 2.5691 | 4.5645 | 24000 | 2.4746 | 0.3864 |
+ | 2.5677 | 5.1350 | 27000 | 2.4708 | 0.3709 |
+ | 2.553 | 5.7056 | 30000 | 2.4648 | 0.3787 |
+ | 2.5567 | 6.2762 | 33000 | 2.4610 | 0.3754 |
+ | 2.5469 | 6.8467 | 36000 | 2.4593 | 0.3670 |
+ | 2.5422 | 7.4173 | 39000 | 2.4566 | 0.3663 |
+ | 2.5376 | 7.9878 | 42000 | 2.4548 | 0.3621 |
+ | 2.534 | 8.5584 | 45000 | 2.4538 | 0.3812 |
+ | 2.5279 | 9.1289 | 48000 | 2.4532 | 0.3695 |
+ | 2.5273 | 9.6995 | 51000 | 2.4493 | 0.3781 |
 
 
  ### Framework versions
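
For reference, the hyperparameter list in the updated card maps onto a standard `transformers` + `peft` training setup roughly as sketched below. This is a minimal sketch only: the LoRA rank/alpha and target modules, the learning rate, the batch size, and the e2e_nlg preprocessing are not visible in this diff, so those values are assumptions for illustration, not the author's actual configuration.

```python
# Hedged sketch: rebuilds the card's hyperparameter list with the standard
# transformers/peft APIs. Anything not in the diff (LoRA rank/alpha, target
# modules, learning rate, batch size, data pipeline) is an assumption.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

base = AutoModelForCausalLM.from_pretrained("gpt2-medium")
tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Assumed LoRA config: a small rank on the fused QKV projection is roughly
# consistent with the ~1.6 MB adapter file, but it is not read from the repo.
lora_config = LoraConfig(
    r=4,
    lora_alpha=32,
    target_modules=["c_attn"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Values below the comment markers come from the card's hyperparameter list.
args = TrainingArguments(
    output_dir="lora-gpt2-medium-e2e",  # hypothetical output path
    num_train_epochs=10,                # changed from 15 to 10 in this commit
    optim="adamw_torch",                # ADAMW_TORCH, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    label_smoothing_factor=0.1,
    fp16=True,                          # "Native AMP"; requires a CUDA device
    learning_rate=2e-4,                 # assumption: not visible in this hunk
)
# A Trainer would then be built from `model`, `args`, the tokenized e2e_nlg
# splits, and a causal-LM data collator; that part is omitted here.
```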
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:158332bf421ea8ea6eb7a4630e0c408d5d683437c138e3c74e11a9042e685b08
+ oid sha256:e2c9de6ba27c90dc7fe95235321d94b2ef8a974e6c63b14275221b6f7cd06643
  size 1578960
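
The second file in the commit swaps out the LoRA adapter weights themselves (same size, new content hash). A minimal loading sketch, assuming the standard `peft` API, is shown below; the repository id and the prompt format for an e2e_nlg meaning representation are placeholders, not taken from this commit.

```python
# Hedged sketch: load the updated adapter on top of gpt2-medium for inference.
# "riemanli/lora-gpt2-medium-e2e" is a placeholder repo id, and the prompt
# format for the e2e_nlg meaning representation is an assumption.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
base = AutoModelForCausalLM.from_pretrained("gpt2-medium")
model = PeftModel.from_pretrained(base, "riemanli/lora-gpt2-medium-e2e")  # placeholder id
model.eval()

# e2e_nlg inputs are meaning representations like the one below.
mr = "name[The Vaults], eatType[pub], priceRange[cheap], near[Café Adriatic]"
inputs = tokenizer(mr + "\n", return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=64,
        num_beams=5,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```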