61347023S committed on
Commit 71d7182 · verified · 1 Parent(s): c4a022c

End of training

Files changed (2)
  1. README.md +85 -0
  2. pytorch_model.bin +1 -1
README.md ADDED
@@ -0,0 +1,85 @@
+ ---
+ base_model: 61347023S/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.1
+ tags:
+ - generated_from_trainer
+ metrics:
+ - accuracy
+ model-index:
+ - name: mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.2
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.2
+
+ This model is a fine-tuned version of [61347023S/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.1](https://huggingface.co/61347023S/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.1) on an unspecified dataset.
+ It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):
+ - Loss: 0.7018
+ - F1 Macro: 0.8729
+ - F1 Micro: 0.8745
+ - Accuracy Balanced: 0.8716
+ - Accuracy: 0.8745
+ - Precision Macro: 0.8748
+ - Recall Macro: 0.8716
+ - Precision Micro: 0.8745
+ - Recall Micro: 0.8745
+
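These metrics presumably follow the standard scikit-learn definitions (the card does not say which implementation was used); a minimal sketch of the mapping, with `y_true` and `y_pred` standing in for the evaluation labels and model predictions, neither of which is included here:

```python
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

# Sketch only: y_true / y_pred are placeholders for the evaluation labels and
# the model's predicted class ids, which do not ship with this card.
def compute_metrics(y_true, y_pred):
    return {
        "f1_macro": f1_score(y_true, y_pred, average="macro"),
        "f1_micro": f1_score(y_true, y_pred, average="micro"),
        "accuracy_balanced": balanced_accuracy_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
        "precision_macro": precision_score(y_true, y_pred, average="macro"),
        "recall_macro": recall_score(y_true, y_pred, average="macro"),
        "precision_micro": precision_score(y_true, y_pred, average="micro"),
        "recall_micro": recall_score(y_true, y_pred, average="micro"),
    }
```

Consistently with these definitions, the micro-averaged F1, precision, and recall equal plain accuracy (all 0.8745 above), and macro recall equals balanced accuracy (both 0.8716), as expected for single-label classification.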
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
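Usage is not documented yet. The model and base-model names indicate an NLI-based multilingual zero-shot classifier, so here is a minimal sketch, assuming the standard `zero-shot-classification` pipeline from `transformers` applies to this checkpoint; the input text and candidate labels are placeholders:

```python
from transformers import pipeline

# Assumption: the checkpoint works with the stock zero-shot-classification
# pipeline, as its XNLI / zero-shot naming suggests.
classifier = pipeline(
    "zero-shot-classification",
    model="61347023S/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.2",
)

result = classifier(
    "The central bank raised interest rates by half a percentage point.",
    candidate_labels=["economy", "politics", "sports", "technology"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```

Any label set can be supplied at inference time; being multilingual, the classifier should also accept non-English inputs, subject to the limitations still to be documented above.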
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a hedged `TrainingArguments` sketch is shown after the list):
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 128
+ - seed: 35
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.06
+ - num_epochs: 3
+
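A minimal sketch of how the listed settings could be expressed with `transformers.TrainingArguments`; `output_dir` and the 200-step evaluation cadence (read off the results table below) are assumptions, the rest mirrors the list above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.2",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=128,
    gradient_accumulation_steps=2,   # 16 * 2 = total train batch size 32
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    seed=35,
    adam_beta1=0.9,                  # Adam betas and epsilon as listed
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",     # inferred from the 200-step eval rows
    eval_steps=200,
)
```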
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------------:|:--------:|:---------------:|:------------:|:---------------:|:------------:|
+ | 0.0962 | 0.17 | 200 | 0.5626 | 0.8709 | 0.8719 | 0.8714 | 0.8719 | 0.8704 | 0.8714 | 0.8719 | 0.8719 |
+ | 0.1218 | 0.34 | 400 | 0.5258 | 0.8635 | 0.8650 | 0.8629 | 0.8650 | 0.8643 | 0.8629 | 0.8650 | 0.8650 |
+ | 0.1246 | 0.51 | 600 | 0.4964 | 0.8652 | 0.8671 | 0.8635 | 0.8671 | 0.8679 | 0.8635 | 0.8671 | 0.8671 |
+ | 0.1225 | 0.68 | 800 | 0.5676 | 0.8618 | 0.8629 | 0.8623 | 0.8629 | 0.8614 | 0.8623 | 0.8629 | 0.8629 |
+ | 0.1429 | 0.85 | 1000 | 0.4402 | 0.8651 | 0.8666 | 0.8643 | 0.8666 | 0.8660 | 0.8643 | 0.8666 | 0.8666 |
+ | 0.1129 | 1.02 | 1200 | 0.5230 | 0.8688 | 0.8703 | 0.8679 | 0.8703 | 0.8699 | 0.8679 | 0.8703 | 0.8703 |
+ | 0.0921 | 1.19 | 1400 | 0.6435 | 0.8503 | 0.8534 | 0.8473 | 0.8534 | 0.8574 | 0.8473 | 0.8534 | 0.8534 |
+ | 0.0972 | 1.35 | 1600 | 0.5313 | 0.8635 | 0.8650 | 0.8628 | 0.8650 | 0.8644 | 0.8628 | 0.8650 | 0.8650 |
+ | 0.0883 | 1.52 | 1800 | 0.6088 | 0.8682 | 0.8692 | 0.8688 | 0.8692 | 0.8678 | 0.8688 | 0.8692 | 0.8692 |
+ | 0.0985 | 1.69 | 2000 | 0.5890 | 0.8696 | 0.8708 | 0.8693 | 0.8708 | 0.8698 | 0.8693 | 0.8708 | 0.8708 |
+ | 0.0838 | 1.86 | 2200 | 0.6647 | 0.8634 | 0.8650 | 0.8626 | 0.8650 | 0.8645 | 0.8626 | 0.8650 | 0.8650 |
+ | 0.0703 | 2.03 | 2400 | 0.6527 | 0.8712 | 0.8729 | 0.8699 | 0.8729 | 0.8732 | 0.8699 | 0.8729 | 0.8729 |
+ | 0.0639 | 2.2 | 2600 | 0.6665 | 0.8695 | 0.8714 | 0.8680 | 0.8714 | 0.8720 | 0.8680 | 0.8714 | 0.8714 |
+ | 0.059 | 2.37 | 2800 | 0.7361 | 0.8668 | 0.8687 | 0.8650 | 0.8687 | 0.8696 | 0.8650 | 0.8687 | 0.8687 |
+ | 0.062 | 2.54 | 3000 | 0.6719 | 0.8742 | 0.8756 | 0.8735 | 0.8756 | 0.8751 | 0.8735 | 0.8756 | 0.8756 |
+ | 0.0419 | 2.71 | 3200 | 0.7057 | 0.8734 | 0.8751 | 0.8722 | 0.8751 | 0.8753 | 0.8722 | 0.8751 | 0.8751 |
+ | 0.0539 | 2.88 | 3400 | 0.7020 | 0.8728 | 0.8745 | 0.8713 | 0.8745 | 0.8751 | 0.8713 | 0.8745 | 0.8745 |
+
+
+ ### Framework versions
+
+ - Transformers 4.33.3
+ - Pytorch 2.5.1+cu121
+ - Datasets 2.14.7
+ - Tokenizers 0.13.3
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5e3beaf88f779c33e880da824558421298c49e74222740a010433a9213cf328b
+ oid sha256:27e065b34112aa29d67b9c5f34ed4d0035548d577961cadb94350d80cc1c2105
  size 1115313586
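The only change to `pytorch_model.bin` is the Git LFS object it points to (the `oid` line); the declared size is unchanged. A minimal sketch for checking a locally downloaded copy against the updated pointer, assuming the weights were fetched via Git LFS:

```python
import hashlib

# Expected values taken from the updated LFS pointer above.
EXPECTED_SHA256 = "27e065b34112aa29d67b9c5f34ed4d0035548d577961cadb94350d80cc1c2105"
EXPECTED_SIZE = 1115313586

def verify(path: str = "pytorch_model.bin") -> bool:
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest() == EXPECTED_SHA256 and size == EXPECTED_SIZE

print(verify())
```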