riemanli committed (verified)
Commit 9ec9fb7 · 1 Parent(s): d8ae717

lora_gpt2_paper_params

Files changed (2)
  1. README.md +22 -20
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -20,8 +20,8 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [gpt2-medium](https://huggingface.co/gpt2-medium) on the e2e_nlg dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.6841
- - Bleu: 0.1572
+ - Loss: 2.4493
+ - Bleu: 0.3781
 
  ## Model description
 
@@ -47,29 +47,31 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 500
- - num_epochs: 15
+ - num_epochs: 10
  - mixed_precision_training: Native AMP
  - label_smoothing_factor: 0.1
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Bleu |
- |:-------------:|:-----:|:----:|:---------------:|:------:|
- | 5.5605 | 1.0 | 25 | 5.0293 | 0.0 |
- | 5.6174 | 2.0 | 50 | 4.9749 | 0.0 |
- | 5.3359 | 3.0 | 75 | 4.8386 | 0.0255 |
- | 5.0451 | 4.0 | 100 | 4.5509 | 0.0143 |
- | 4.6183 | 5.0 | 125 | 4.0375 | 0.0536 |
- | 3.9072 | 6.0 | 150 | 3.5277 | 0.0052 |
- | 3.6058 | 7.0 | 175 | 3.2481 | 0.1550 |
- | 3.4162 | 8.0 | 200 | 3.0935 | 0.0140 |
- | 3.2618 | 9.0 | 225 | 2.9592 | 0.0 |
- | 3.1868 | 10.0 | 250 | 2.8875 | 0.0196 |
- | 3.1306 | 11.0 | 275 | 2.8068 | 0.0 |
- | 3.0673 | 12.0 | 300 | 2.7307 | 0.0 |
- | 3.054 | 13.0 | 325 | 2.6913 | 0.0 |
- | 2.9306 | 14.0 | 350 | 2.6773 | 0.0 |
- | 2.9358 | 15.0 | 375 | 2.6841 | 0.1572 |
+ | Training Loss | Epoch | Step | Validation Loss | Bleu |
+ |:-------------:|:------:|:-----:|:---------------:|:------:|
+ | 2.9523 | 0.5706 | 3000 | 2.6028 | 0.3489 |
+ | 2.6924 | 1.1411 | 6000 | 2.5544 | 0.3501 |
+ | 2.6493 | 1.7117 | 9000 | 2.5217 | 0.4052 |
+ | 2.6252 | 2.2822 | 12000 | 2.5048 | 0.3894 |
+ | 2.6023 | 2.8528 | 15000 | 2.4957 | 0.4060 |
+ | 2.5962 | 3.4234 | 18000 | 2.4863 | 0.3772 |
+ | 2.5797 | 3.9939 | 21000 | 2.4812 | 0.3697 |
+ | 2.5691 | 4.5645 | 24000 | 2.4746 | 0.3864 |
+ | 2.5677 | 5.1350 | 27000 | 2.4708 | 0.3709 |
+ | 2.553 | 5.7056 | 30000 | 2.4648 | 0.3787 |
+ | 2.5567 | 6.2762 | 33000 | 2.4610 | 0.3754 |
+ | 2.5469 | 6.8467 | 36000 | 2.4593 | 0.3670 |
+ | 2.5422 | 7.4173 | 39000 | 2.4566 | 0.3663 |
+ | 2.5376 | 7.9878 | 42000 | 2.4548 | 0.3621 |
+ | 2.534 | 8.5584 | 45000 | 2.4538 | 0.3812 |
+ | 2.5279 | 9.1289 | 48000 | 2.4532 | 0.3695 |
+ | 2.5273 | 9.6995 | 51000 | 2.4493 | 0.3781 |
 
 
  ### Framework versions
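
For reference, the hyperparameter list in the updated card maps onto a standard `transformers` + `peft` training setup roughly as sketched below. This is a minimal sketch only: the LoRA rank/alpha and target modules, the learning rate, the batch size, and the e2e_nlg preprocessing are not visible in this diff, so those values are assumptions for illustration, not the author's actual configuration.

```python
# Hedged sketch: rebuilds the card's hyperparameter list with the standard
# transformers/peft APIs. Anything not in the diff (LoRA rank/alpha, target
# modules, learning rate, batch size, data pipeline) is an assumption.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

base = AutoModelForCausalLM.from_pretrained("gpt2-medium")
tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Assumed LoRA config: a small rank on the fused QKV projection is roughly
# consistent with the ~1.6 MB adapter file, but it is not read from the repo.
lora_config = LoraConfig(
    r=4,
    lora_alpha=32,
    target_modules=["c_attn"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Values below the comment markers come from the card's hyperparameter list.
args = TrainingArguments(
    output_dir="lora-gpt2-medium-e2e",  # hypothetical output path
    num_train_epochs=10,                # changed from 15 to 10 in this commit
    optim="adamw_torch",                # ADAMW_TORCH, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    label_smoothing_factor=0.1,
    fp16=True,                          # "Native AMP"; requires a CUDA device
    learning_rate=2e-4,                 # assumption: not visible in this hunk
)
# A Trainer would then be built from `model`, `args`, the tokenized e2e_nlg
# splits, and a causal-LM data collator; that part is omitted here.
```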
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:158332bf421ea8ea6eb7a4630e0c408d5d683437c138e3c74e11a9042e685b08
+ oid sha256:e2c9de6ba27c90dc7fe95235321d94b2ef8a974e6c63b14275221b6f7cd06643
  size 1578960
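
The second file in the commit swaps out the LoRA adapter weights themselves (same size, new content hash). A minimal loading sketch, assuming the standard `peft` API, is shown below; the repository id and the prompt format for an e2e_nlg meaning representation are placeholders, not taken from this commit.

```python
# Hedged sketch: load the updated adapter on top of gpt2-medium for inference.
# "riemanli/lora-gpt2-medium-e2e" is a placeholder repo id, and the prompt
# format for the e2e_nlg meaning representation is an assumption.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
base = AutoModelForCausalLM.from_pretrained("gpt2-medium")
model = PeftModel.from_pretrained(base, "riemanli/lora-gpt2-medium-e2e")  # placeholder id
model.eval()

# e2e_nlg inputs are meaning representations like the one below.
mr = "name[The Vaults], eatType[pub], priceRange[cheap], near[Café Adriatic]"
inputs = tokenizer(mr + "\n", return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=64,
        num_beams=5,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```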