lvcalucioli committed
Commit 89cba95 (verified) · 1 parent: 4c6e565

ca-finetuned-flan-t5-large

README.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ license: apache-2.0
+ base_model: google/flan-t5-large
+ tags:
+ - generated_from_trainer
+ metrics:
+ - accuracy
+ model-index:
+ - name: ca-finetuned-flan-t5-large
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # ca-finetuned-flan-t5-large
+
+ This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 5.3915
+ - Accuracy: 0.0667
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0003
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
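The linear scheduler above can be made concrete with a short sketch. This assumes no warmup steps (the Trainer default when none are listed, as here) and uses the 230 total optimizer steps implied by the results table (23 steps/epoch × 10 epochs); both are inferences from the card, not stated facts.

```python
def linear_lr(step: int, base_lr: float = 3e-4, total_steps: int = 230) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero,
    assuming zero warmup steps (the default when no warmup is configured)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # base learning rate at the first step
print(linear_lr(115))  # halved at the end of epoch 5
print(linear_lr(230))  # fully decayed at the end of training
```
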
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|
+ | 3.3712        | 1.0   | 23   | 2.8727          | 0.0028   |
+ | 1.8522        | 2.0   | 46   | 3.0376          | 0.0333   |
+ | 1.0511        | 3.0   | 69   | 3.3480          | 0.0528   |
+ | 0.6972        | 4.0   | 92   | 3.7006          | 0.0417   |
+ | 0.5066        | 5.0   | 115  | 3.8307          | 0.0556   |
+ | 0.3017        | 6.0   | 138  | 4.5624          | 0.0611   |
+ | 0.1929        | 7.0   | 161  | 4.6136          | 0.0694   |
+ | 0.1591        | 8.0   | 184  | 4.8349          | 0.0611   |
+ | 0.1061        | 9.0   | 207  | 5.0931          | 0.0611   |
+ | 0.0638        | 10.0  | 230  | 5.3915          | 0.0667   |
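The loss columns in the table above are worth a second look: validation loss is lowest after epoch 1 and rises every epoch thereafter while training loss keeps falling, a classic overfitting pattern suggesting an early checkpoint may be the better one to keep. A quick check, with the values copied from the table:

```python
# Losses copied from the training-results table (epochs 1-10).
train_loss = [3.3712, 1.8522, 1.0511, 0.6972, 0.5066,
              0.3017, 0.1929, 0.1591, 0.1061, 0.0638]
val_loss = [2.8727, 3.0376, 3.3480, 3.7006, 3.8307,
            4.5624, 4.6136, 4.8349, 5.0931, 5.3915]

# Training loss falls every epoch while validation loss rises every epoch
# after the first: the model memorizes rather than generalizes.
best_epoch = 1 + val_loss.index(min(val_loss))
print(f"best validation loss {min(val_loss)} at epoch {best_epoch}")
```
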
+
+
+ ### Framework versions
+
+ - Transformers 4.38.0.dev0
+ - Pytorch 2.0.1+cu117
+ - Datasets 2.16.1
+ - Tokenizers 0.15.2
config.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "_name_or_path": "google/flan-t5-large",
+   "architectures": [
+     "T5ForConditionalGeneration"
+   ],
+   "classifier_dropout": 0.0,
+   "d_ff": 2816,
+   "d_kv": 64,
+   "d_model": 1024,
+   "decoder_start_token_id": 0,
+   "dense_act_fn": "gelu_new",
+   "dropout_rate": 0.1,
+   "eos_token_id": 1,
+   "feed_forward_proj": "gated-gelu",
+   "initializer_factor": 1.0,
+   "is_encoder_decoder": true,
+   "is_gated_act": true,
+   "layer_norm_epsilon": 1e-06,
+   "model_type": "t5",
+   "n_positions": 512,
+   "num_decoder_layers": 24,
+   "num_heads": 16,
+   "num_layers": 24,
+   "output_past": true,
+   "pad_token_id": 0,
+   "relative_attention_max_distance": 128,
+   "relative_attention_num_buckets": 32,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.38.0.dev0",
+   "use_cache": true,
+   "vocab_size": 32128
+ }
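A few of the dimensions above fit together in ways worth sanity-checking. This sketch loads a subset of the values copied from config.json: the per-head key/value width times the head count equals the model width here (T5 does not require this in general, but flan-t5-large happens to satisfy it), and because `tie_word_embeddings` is false, the input embedding and the LM head each hold their own `vocab_size × d_model` matrix.

```python
import json

# Subset of the config.json values shown above.
config = json.loads("""{
  "d_ff": 2816, "d_kv": 64, "d_model": 1024,
  "num_heads": 16, "num_layers": 24, "num_decoder_layers": 24,
  "vocab_size": 32128, "tie_word_embeddings": false
}""")

# Per-head width x head count matches the model width for this checkpoint.
assert config["d_kv"] * config["num_heads"] == config["d_model"]

# Untied embeddings: two separate vocab_size x d_model matrices.
embedding_params = config["vocab_size"] * config["d_model"]
print(f"{embedding_params:,} parameters per embedding matrix")
```
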
emissions.csv ADDED
@@ -0,0 +1,2 @@
+ timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
+ 2024-02-15T16:22:42,35663134-39f3-46b0-8561-a98874cd327b,codecarbon,425.9083001613617,0.04518910453572998,0.08451535760263451,Canada,CAN,,N,,
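The codecarbon log above is plain CSV and can be read with the standard library. A small sketch (the two lines are copied verbatim from the file; codecarbon reports `duration` in seconds, `emissions` in kg CO2eq, and `energy_consumed` in kWh) turns the run's footprint into an hourly rate:

```python
import csv
import io

# The two lines of emissions.csv, copied verbatim from the diff above.
raw = (
    "timestamp,experiment_id,project_name,duration,emissions,energy_consumed,"
    "country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region\n"
    "2024-02-15T16:22:42,35663134-39f3-46b0-8561-a98874cd327b,codecarbon,"
    "425.9083001613617,0.04518910453572998,0.08451535760263451,Canada,CAN,,N,,\n"
)

row = next(csv.DictReader(io.StringIO(raw)))
duration_h = float(row["duration"]) / 3600  # seconds -> hours
kg_co2 = float(row["emissions"])            # kg CO2eq for the whole run
kwh = float(row["energy_consumed"])         # kWh for the whole run

print(f"{duration_h:.3f} h, {kg_co2:.4f} kg CO2eq, {kwh:.4f} kWh")
print(f"~{kg_co2 / duration_h:.2f} kg CO2eq per hour of training")
```
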
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "decoder_start_token_id": 0,
+   "eos_token_id": 1,
+   "pad_token_id": 0,
+   "transformers_version": "4.38.0.dev0"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f543f307bdea6785a50649f02aa2147539d6b41d42b601ec71b124eee62a879f
+ size 3132668808
runs/Feb15_15-32-18_c0c8f2aaa2e2/events.out.tfevents.1708011141.c0c8f2aaa2e2.36424.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e6d66f21933dbcdb104a25c208a12ca5dca28cd3f475ca0d0b2c7ba3c2e1d6a5
+ size 4708
runs/Feb15_15-38-33_c0c8f2aaa2e2/events.out.tfevents.1708011515.c0c8f2aaa2e2.37349.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d3cbdcde0b7373d61969455423829c2d02f4c6a7cbf0291511ffc3197ba7a9ad
+ size 4708
runs/Feb15_15-39-10_c0c8f2aaa2e2/events.out.tfevents.1708011552.c0c8f2aaa2e2.37553.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ec8ecf7bb145a88c6e99acb2c103c6eccfd2e888ed85c8064059dd1f5c2a09c5
+ size 4708
runs/Feb15_15-42-28_c0c8f2aaa2e2/events.out.tfevents.1708011751.c0c8f2aaa2e2.37859.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1a5e672d8d293751aea67ee4ef2518004c188133b2720aa63dd4dab7daf5ab4
+ size 4708
runs/Feb15_15-48-01_c0c8f2aaa2e2/events.out.tfevents.1708012084.c0c8f2aaa2e2.38411.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d81c9fa69f84f19b23a5e573e2ee1fe2a8e7e50330042c8bb10372d10973cae6
+ size 5976
runs/Feb15_16-15-31_c0c8f2aaa2e2/events.out.tfevents.1708013734.c0c8f2aaa2e2.38769.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5017517ee1995b6285cd3de25f7bb6f07f296fa1b01d1947cf9bb508ae5dce44
+ size 9817
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3fe05791115b36deb55925d39b4eec88a572aba6c8294383f76e6cbf3309223b
+ size 4539