softhell/code_docstring_model

@@ -23,7 +23,7 @@ model-index:
     metrics:
     - name: Bleu
       type: bleu
-      value: 0.013233021060148939
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -33,8 +33,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [Salesforce/codet5-base](https://huggingface.co/Salesforce/codet5-base) on the code_search_net dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9120
-- Bleu: 0.0132
 ## Model description
@@ -55,25 +55,27 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 64
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Bleu   |
 |:-------------:|:------:|:----:|:---------------:|:------:|
-| 1.0906        | 1.0    | 1002 | 0.9715          | 0.0107 |
-| 0.9922        | 2.0    | 2004 | 0.9390          | 0.0108 |
-| 0.9325        | 3.0    | 3006 | 0.9233          | 0.0113 |
-| 0.8936        | 4.0    | 4008 | 0.9134          | 0.0124 |
-| 0.8769        | 4.9953 | 5005 | 0.9120          | 0.0132 |
 ### Framework versions

     metrics:
     - name: Bleu
       type: bleu
+      value: 0.01865929848556658
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [Salesforce/codet5-base](https://huggingface.co/Salesforce/codet5-base) on the code_search_net dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.9051
+- Bleu: 0.0187
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 16
+- eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 64
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 7
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Bleu   |
 |:-------------:|:------:|:----:|:---------------:|:------:|
+| 1.111         | 1.0    | 1002 | 0.9781          | 0.0108 |
+| 0.998         | 2.0    | 2004 | 0.9397          | 0.0109 |
+| 0.9295        | 3.0    | 3006 | 0.9204          | 0.0120 |
+| 0.8814        | 4.0    | 4008 | 0.9088          | 0.0159 |
+| 0.8557        | 5.0    | 5010 | 0.9064          | 0.0171 |
+| 0.8364        | 6.0    | 6012 | 0.9055          | 0.0180 |
+| 0.8184        | 6.9933 | 7007 | 0.9051          | 0.0187 |
 ### Framework versions
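
The derived values in the updated hyperparameter list can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device, and it assumes warmup length is the warmup ratio times the total optimizer steps, rounded up. The variable names are hypothetical and not taken from the training script.

```python
import math

# Values copied from the updated hyperparameter list and results table.
train_batch_size = 16
gradient_accumulation_steps = 4
lr_scheduler_warmup_ratio = 0.1
total_optimizer_steps = 7007  # last step logged in the results table

# Assumption: one device, so no extra data-parallel multiplier.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: warmup steps = ceil(total steps * warmup ratio).
warmup_steps = math.ceil(total_optimizer_steps * lr_scheduler_warmup_ratio)

print(f"effective batch size: {effective_batch_size}")
print(f"warmup steps (assumed rounding): {warmup_steps}")
```

The first value agrees with the `total_train_batch_size: 64` entry in the card. The warmup figure is an estimate under the stated rounding assumption, not a number reported by the card.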

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d74d7f8df141d0f53996700750693771b6f8aedeed3981d0a9e4c922ce32460f
 size 891638568

 version https://git-lfs.github.com/spec/v1
+oid sha256:5d6155aa929b67c581637d2a93a789bd7c6d5f36dab36c0639158feead435206
 size 891638568

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:353600229a52d1bc40f4e2190c68b4e1e545a3595edbe90c69597e2cc3abf322
 size 5496

 version https://git-lfs.github.com/spec/v1
+oid sha256:e9e6e90433f234f69fc1383db1ae79466a3171c73f1ab31cd9ddbdc3f15ba214
 size 5496
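
Both binary files are stored as Git LFS pointers, so only the `oid` changes while `size` stays the same. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the three-line pointer layout shown in the diff above.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and digest the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:e9e6e90433f234f69fc1383db1ae79466a3171c73f1ab31cd9ddbdc3f15ba214
size 5496
"""
new_pointer = parse_lfs_pointer(pointer_text)
print(new_pointer["algo"], new_pointer["size"])
```

To check a real download, read the file as bytes and pass it to `matches_pointer` together with the parsed pointer. For a file the size of `model.safetensors`, hashing in chunks would be preferable to reading everything into memory.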