## bert-base-uncased finetuned on IMDB dataset

The evaluation set was created by taking 1000 samples from the test set:

```
DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 25000
    })
    dev: Dataset({
        features: ['text', 'label'],
        num_rows: 1000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 24000
    })
})
```
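
The card does not record exactly how the split was drawn, but a minimal sketch of producing an equivalent `DatasetDict` with the `datasets` library could look like this (the seed and the use of `train_test_split` are assumptions):

```python
from datasets import load_dataset, DatasetDict

# Load the stock IMDB dataset: 25k labelled train / 25k labelled test examples.
imdb = load_dataset("imdb")

# Hold out 1000 test examples as a dev set; the seed is an assumption,
# the card does not record how the samples were drawn.
split = imdb["test"].train_test_split(test_size=1000, seed=42)

dataset = DatasetDict({
    "train": imdb["train"],
    "dev": split["test"],    # the 1000 held-out samples
    "test": split["train"],  # the remaining 24000 samples
})
print(dataset)
```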

## Parameters

```
max_sequence_length = 128
batch_size = 32
eval_steps = 100
learning_rate = 2e-05
num_train_epochs = 5
early_stopping_patience = 10
```
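
As a rough guide, these parameters map onto `transformers.TrainingArguments` as sketched below. This is not the exact training script: the output directory, the per-device interpretation of `batch_size`, and the checkpoint-selection metric are assumptions, and the code continues from the split sketch above.

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
    EarlyStoppingCallback,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def tokenize(batch):
    # max_sequence_length = 128
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

args = TrainingArguments(
    output_dir="bert-base-uncased-imdb",   # assumption: output path is not recorded in the card
    evaluation_strategy="steps",
    eval_steps=100,                        # eval_steps = 100
    save_steps=100,                        # checkpoints must align with evals for early stopping
    per_device_train_batch_size=32,        # assumes batch_size = 32 is the per-device batch size
    per_device_eval_batch_size=32,
    learning_rate=2e-05,
    num_train_epochs=5,
    load_best_model_at_end=True,
    metric_for_best_model="f1",            # assumption: the selection metric is not recorded
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].map(tokenize, batched=True),
    eval_dataset=dataset["dev"].map(tokenize, batched=True),
    compute_metrics=compute_metrics,       # see the sketch under "Training Run" below
    callbacks=[EarlyStoppingCallback(early_stopping_patience=10)],
)
trainer.train()
```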

## Training Run

```
[2700/3910 1:11:43 < 32:09, 0.63 it/s, Epoch 3/5]
```

| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 | Runtime | Samples Per Second |
|-----:|--------------:|----------------:|---------:|----------:|-------:|---:|--------:|-------------------:|
| 100 | No log | 0.371974 | 0.845000 | 0.798942 | 0.917004 | 0.853911 | 15.256900 | 65.544000 |
| 200 | No log | 0.349631 | 0.850000 | 0.873913 | 0.813765 | 0.842767 | 15.288600 | 65.408000 |
| 300 | No log | 0.359376 | 0.845000 | 0.869281 | 0.807692 | 0.837356 | 15.303900 | 65.343000 |
| 400 | No log | 0.307613 | 0.870000 | 0.851351 | 0.892713 | 0.871542 | 15.358400 | 65.111000 |
| 500 | 0.364500 | 0.309362 | 0.856000 | 0.807018 | 0.931174 | 0.864662 | 15.326100 | 65.248000 |
| 600 | 0.364500 | 0.302709 | 0.867000 | 0.881607 | 0.844130 | 0.862461 | 15.324400 | 65.255000 |
| 700 | 0.364500 | 0.300102 | 0.871000 | 0.894168 | 0.838057 | 0.865204 | 15.474900 | 64.621000 |
| 800 | 0.364500 | 0.383784 | 0.866000 | 0.833333 | 0.910931 | 0.870406 | 15.380100 | 65.019000 |
| 900 | 0.364500 | 0.309934 | 0.874000 | 0.881743 | 0.860324 | 0.870902 | 15.358900 | 65.109000 |
| 1000 | 0.254600 | 0.332236 | 0.872000 | 0.894397 | 0.840081 | 0.866388 | 15.442700 | 64.756000 |
| 1100 | 0.254600 | 0.330807 | 0.871000 | 0.877847 | 0.858300 | 0.867963 | 15.410900 | 64.889000 |
| 1200 | 0.254600 | 0.352724 | 0.872000 | 0.925581 | 0.805668 | 0.861472 | 15.272800 | 65.476000 |
| 1300 | 0.254600 | 0.278529 | 0.881000 | 0.891441 | 0.864372 | 0.877698 | 15.408200 | 64.900000 |
| 1400 | 0.254600 | 0.291371 | 0.878000 | 0.854962 | 0.906883 | 0.880157 | 15.427400 | 64.820000 |
| 1500 | 0.208400 | 0.324827 | 0.869000 | 0.904232 | 0.821862 | 0.861082 | 15.338600 | 65.195000 |
| 1600 | 0.208400 | 0.377024 | 0.884000 | 0.898734 | 0.862348 | 0.880165 | 15.414500 | 64.874000 |
| 1700 | 0.208400 | 0.375274 | 0.885000 | 0.881288 | 0.886640 | 0.883956 | 15.367200 | 65.073000 |
| 1800 | 0.208400 | 0.378904 | 0.880000 | 0.877016 | 0.880567 | 0.878788 | 15.363900 | 65.088000 |
| 1900 | 0.208400 | 0.410517 | 0.874000 | 0.866534 | 0.880567 | 0.873494 | 15.324700 | 65.254000 |
| 2000 | 0.130800 | 0.404030 | 0.876000 | 0.888655 | 0.856275 | 0.872165 | 15.414200 | 64.875000 |
| 2100 | 0.130800 | 0.390763 | 0.883000 | 0.882353 | 0.880567 | 0.881459 | 15.341500 | 65.183000 |
| 2200 | 0.130800 | 0.417967 | 0.880000 | 0.875502 | 0.882591 | 0.879032 | 15.351300 | 65.141000 |
| 2300 | 0.130800 | 0.390974 | 0.883000 | 0.898520 | 0.860324 | 0.879007 | 15.396100 | 64.952000 |
| 2400 | 0.130800 | 0.479739 | 0.874000 | 0.856589 | 0.894737 | 0.875248 | 15.460500 | 64.681000 |
| 2500 | 0.098400 | 0.473215 | 0.875000 | 0.883576 | 0.860324 | 0.871795 | 15.392200 | 64.968000 |
| 2600 | 0.098400 | 0.532294 | 0.872000 | 0.889362 | 0.846154 | 0.867220 | 15.364100 | 65.087000 |
| 2700 | 0.098400 | 0.536664 | 0.881000 | 0.880325 | 0.878543 | 0.879433 | 15.351100 | 65.142000 |

```
TrainOutput(global_step=2700, training_loss=0.2004435383832013, metrics={'train_runtime': 4304.5331, 'train_samples_per_second': 0.908, 'total_flos': 7258763970957312, 'epoch': 3.45})
```
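
The per-eval Accuracy, Precision, Recall, and F1 columns above come from a metrics function passed to the `Trainer`. A plausible `compute_metrics` implementation using scikit-learn is sketched below; the exact implementation used for this run is not recorded in the card.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair produced by the Trainer at each eval step.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average="binary")
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```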

## Classification Report

```
              precision    recall  f1-score   support

           0       0.90      0.87      0.89     11994
           1       0.87      0.90      0.89     12006

    accuracy                           0.89     24000
   macro avg       0.89      0.89      0.89     24000
weighted avg       0.89      0.89      0.89     24000
```
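
A report in this format can be reproduced by predicting over the 24000-example test split and calling scikit-learn's `classification_report`. The sketch below reuses the `dataset`, `tokenize`, and `trainer` names from the earlier sketches and is not the exact evaluation script.

```python
from sklearn.metrics import classification_report

# Tokenize the remaining 24000 test examples and predict with the finetuned model.
test_ds = dataset["test"].map(tokenize, batched=True)
pred_output = trainer.predict(test_ds)
preds = pred_output.predictions.argmax(axis=-1)

print(classification_report(pred_output.label_ids, preds))
```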