softhell committed
Commit bb05017 · verified · 1 Parent(s): 2645ed4

codet5_base_train_e5

Files changed (4):
  1. README.md +17 -16
  2. config.json +6 -6
  3. model.safetensors +2 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: Salesforce/codet5-small
+base_model: Salesforce/codet5-base
 tags:
 - generated_from_trainer
 datasets:
@@ -23,7 +23,7 @@ model-index:
     metrics:
     - name: Bleu
       type: bleu
-      value: 0.009912737763560728
+      value: 0.013233021060148939
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,10 +31,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # code_docstring_model
 
-This model is a fine-tuned version of [Salesforce/codet5-small](https://huggingface.co/Salesforce/codet5-small) on the code_search_net dataset.
+This model is a fine-tuned version of [Salesforce/codet5-base](https://huggingface.co/Salesforce/codet5-base) on the code_search_net dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1194
-- Bleu: 0.0099
+- Loss: 0.9120
+- Bleu: 0.0132
 
 ## Model description
 
@@ -53,26 +53,27 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 5e-05
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 32
+- total_train_batch_size: 64
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 5
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch  | Step  | Validation Loss | Bleu   |
-|:-------------:|:------:|:-----:|:---------------:|:------:|
-| 1.4113        | 1.0    | 2004  | 1.2082          | 0.0095 |
-| 1.3283        | 2.0    | 4008  | 1.1537          | 0.0097 |
-| 1.3036        | 3.0    | 6012  | 1.1331          | 0.0098 |
-| 1.2585        | 4.0    | 8016  | 1.1226          | 0.0098 |
-| 1.2613        | 4.9978 | 10015 | 1.1194          | 0.0099 |
+| Training Loss | Epoch  | Step | Validation Loss | Bleu   |
+|:-------------:|:------:|:----:|:---------------:|:------:|
+| 1.0906        | 1.0    | 1002 | 0.9715          | 0.0107 |
+| 0.9922        | 2.0    | 2004 | 0.9390          | 0.0108 |
+| 0.9325        | 3.0    | 3006 | 0.9233          | 0.0113 |
+| 0.8936        | 4.0    | 4008 | 0.9134          | 0.0124 |
+| 0.8769        | 4.9953 | 5005 | 0.9120          | 0.0132 |
 
 
 ### Framework versions
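The new run keeps gradient_accumulation_steps at 4 while doubling the per-device batch from 8 to 16, which is why total_train_batch_size moves from 32 to 64. As a sanity check (a minimal sketch, not code from the training run; single-device training assumed):

```python
# Effective (total) train batch size = per-device batch * accumulation steps.
# Values are taken from the model card diff above; num_devices=1 is assumed.
def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    return per_device * accum_steps * num_devices

old_total = effective_batch_size(8, 4)    # codet5-small run  -> 32
new_total = effective_batch_size(16, 4)   # codet5-base run   -> 64
print(old_total, new_total)  # 32 64
```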
config.json CHANGED
@@ -1,13 +1,13 @@
 {
-  "_name_or_path": "Salesforce/codet5-small",
+  "_name_or_path": "Salesforce/codet5-base",
   "architectures": [
     "T5ForConditionalGeneration"
   ],
   "bos_token_id": 1,
   "classifier_dropout": 0.0,
-  "d_ff": 2048,
+  "d_ff": 3072,
   "d_kv": 64,
-  "d_model": 512,
+  "d_model": 768,
   "decoder_start_token_id": 0,
   "dense_act_fn": "relu",
   "dropout_rate": 0.1,
@@ -26,9 +26,9 @@
   "layer_norm_epsilon": 1e-06,
   "model_type": "t5",
   "n_positions": 512,
-  "num_decoder_layers": 6,
-  "num_heads": 8,
-  "num_layers": 6,
+  "num_decoder_layers": 12,
+  "num_heads": 12,
+  "num_layers": 12,
   "output_past": true,
   "pad_token_id": 0,
   "relative_attention_max_distance": 128,
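The config change doubles both width (d_model 512→768, d_ff 2048→3072) and depth (6→12 layers in encoder and decoder), which roughly quadruples the parameter count. A rough estimate from these dimensions alone (a sketch, not from any library; vocab_size=32100 comes from the published codet5 configs rather than this hunk, embeddings are assumed tied, and layer norms and relative-position bias tables are ignored, so it slightly undercounts):

```python
# Rough T5 parameter estimate from the config dimensions in the diff above.
def t5_params(d_model: int, d_ff: int, num_layers: int,
              num_decoder_layers: int, vocab_size: int = 32100) -> int:
    attn = 4 * d_model * d_model                      # q, k, v, o projections
    ffn = 2 * d_model * d_ff                          # wi and wo matrices
    encoder = num_layers * (attn + ffn)
    decoder = num_decoder_layers * (2 * attn + ffn)   # self- + cross-attention
    return vocab_size * d_model + encoder + decoder   # tied embeddings counted once

small = t5_params(512, 2048, 6, 6)     # old config: ~60.5M
base = t5_params(768, 3072, 12, 12)    # new config: ~223M
print(small, base)
```

Both estimates agree with the fp32 checkpoint sizes in the model.safetensors change below to within about 0.1%.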
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:96e6e7f07f24e59be74c6d559a79325413ed1d8aed5a419fa41906e70a5bf66d
-size 242037800
+oid sha256:d74d7f8df141d0f53996700750693771b6f8aedeed3981d0a9e4c922ce32460f
+size 891638568
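The ~3.7× jump in checkpoint size is consistent with swapping codet5-small for codet5-base. Assuming fp32 storage (4 bytes per parameter, which matches these sizes; safetensors header overhead is negligible), file size divided by 4 approximates the parameter count:

```python
# Convert an fp32 safetensors file size to an approximate parameter count.
def approx_params(size_bytes: int, bytes_per_param: int = 4) -> int:
    return size_bytes // bytes_per_param

print(approx_params(242037800))   # old checkpoint: ~60.5M params (codet5-small scale)
print(approx_params(891638568))   # new checkpoint: ~223M params (codet5-base scale)
```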
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3b37b21c03d518303a92294b07f5807779171f7cb7ed6a230c85043641d48951
+oid sha256:353600229a52d1bc40f4e2190c68b4e1e545a3595edbe90c69597e2cc3abf322
 size 5496